(57) Abstract

A system for accessing documents contained in a remote repository, which change in content from version-to-version. The system
allows users to specify lists of documents of interest. Based on the lists, the system maintains an archive, which contains a copy of one
version of each listed document, and material from which the other versions can be reconstructed. The system periodically compares
the archive with current versions of the documents located in the repository, and updates the archive, thereby maintaining the ability to
reconstruct current versions. The system also monitors access to the versions by each user. When a user calls for a current version, the
system presents the current version, and indicates what parts of the current version have not been previously accessed by the user.
IDENTIFYING CHANGES IN ON-LINE DATA REPOSITORIES

The invention concerns presentation of a current version of
a document retrieved from a data repository. The presentation
indicates changes made in the document since the viewer accessed
a previous version.

BACKGROUND OF THE INVENTION 

Information which is stored in computerized systems can change 
frequently, and without notice. As an example, software under
development frequently involves many persons, and is commonly
stored at a central location. Each person can change the software
on an ad hoc basis, without knowledge of others. 

In such systems containing changeable data, a person who 
examines information on a given day does not, in general, know 
whether, and how, the information has changed since a previous 
examination. Consequently, the person must spend time comparing 
currently available information with previous versions of the 
information. 

Software exists for facilitating this comparison. For 
example, systems known as "version control systems," or "revision 
control systems," store data which represents multiple versions of 
different documents, as indicated in Figure 1A. In that Figure,
the DATA is indicated, together with dashed loops which indicate 
the VERSIONS. 

The loops indicate that the VERSIONS are contained in, and 
interspersed among a much larger document. Small arrows point to
changes, which are primarily additions in this case. The change
in the "last update" date gives an example of text being replaced.
Here the page's author had highlighted the changes manually with 
small icons as well. The banner at the top of the page was 
inserted by HTMLDIFF. 

Figure 12 illustrates version histories which give the user 
a chance to compare any two versions, or to go directly to a 
selected version.

Figure 13 illustrates output of W3NEWER, and shows a number 
of anchors (the descriptive text originates from the hot list).
The anchors marked "changed" have modification dates after the time 
which the user's browser history indicates the URL was last seen. 
Some URLs were not checked at all, and others were checked and are 
known to have been seen by the user. 
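
The outcomes described for Figure 13 can be sketched as a small classification function. This is only an illustration in Python, not the W3NEWER code from the listing: the name `classify_anchor`, the status strings other than "changed", the numeric timestamps, and the handling of a URL with no history entry are all assumptions.

```python
from typing import Optional


def classify_anchor(last_modified: Optional[float],
                    last_seen: Optional[float]) -> str:
    """Classify one hot-list URL (hypothetical helper).

    last_modified: modification time reported for the URL, or None if the
                   URL was not checked at all.
    last_seen:     time the browser history says the user last saw the URL,
                   or None if the history has no entry (assumed case).
    """
    if last_modified is None:
        return "not checked"
    if last_seen is None:
        return "never seen"   # assumption: the source does not cover this case
    if last_modified > last_seen:
        return "changed"      # modified after the user last saw it
    return "seen"             # checked, and the user has seen this version
```

For instance, `classify_anchor(10.0, 5.0)` falls in the "changed" branch because the modification time is later than the last-seen time.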

Figure 14 demonstrates use of a SNAPSHOT facility, which 
allows a user to specify an operation on a URL. In this example, 
DOUGLIS@RESEARCH.ATT.COM is "remembering" URL
HTTP://SNAPPLE.CS.WASHINGTON.EDU:600/MOBILE/.

DETAILED DESCRIPTION OF THE INVENTION 

A TECHNICAL APPENDIX, which is located at the end of this 
Specification, describes the invention in detail. Following the 
TECHNICAL APPENDIX is a COMPUTER PROGRAM LISTING, which contains 
code which implements one form of the invention. 

An illustrative embodiment of the invention is given in the
discussion below.

WO 97/15890    PCT/US96/17142

Overview of invention 
A commonly used repository of information is known as the 
World Wide Web, or WWW. In the WWW, providers of information make 
their information available to users in the form of "pages." Each 
page is assigned a name, which distinguishes the page from other 
pages, and allows a user to locate the page. 

The WWW provides information using an information retrieval- 
and-display approach called "hypertext." In hypertext, a page may 
contain references to other pages, or other documents. A user can 
call up a page which is referenced, by clicking on the reference
(called a URL, or Universal Resource Locator) with a pointing
device. Figure 1B provides an example.

In Figure 1B, a document D is displayed to a user. References
R refer to other documents. For example, R1 refers to D1, R2
refers to D2, and so on. The referenced documents themselves may 
contain their own references to other documents, such as R4, which 
refers to D4. 

A user can retrieve a referenced document D, by clicking on 
the reference R which refers to it. For example, clicking on R1
causes retrieval and display of D1.

Under the invention, a user of the WWW initially identifies 
pages of interest. Document D in Figure 1B represents one page.
These selected pages form a "hot list." Then, the invention does 
the following: 
(a) Copies the hot-listed pages into an archive, which
is a storage location separate from the WWW and under
independent control. After the copying, the original
pages continue to reside in the WWW, and copies reside
in the archive.

(b) Monitors, at later times, the original pages for 
changes, and archives the changes. 

(c) Records the times when the user later accesses each 
hot-listed page. 

(d) Whenever the user accesses a hot-listed page, 
presents the user with 

i) the current version of the page (which may 
differ from the initial copy which was stored 
in the archive) ; and 

ii) an option to compare selected versions of 
the page. The comparison is presented by 
performing a differencing operation on pairs 
of versions. 

(e) As an option, the invention also implements the steps
described above with respect to documents referenced by
the page. For example, in Figure 1B, if a user is
viewing document D, the invention can present the current 
version of reference document D2, together with a history 
of D2. 
More Detailed Description

Hot-List Pages are Stored in EXTERNAL SERVICE
Figure 1 illustrates a REPOSITORY of information, such as the 
WWW. For assistance in accessing the REPOSITORY, the invention 
provides the EXTERNAL SERVICE which includes: 

(a) SOFTWARE, such as that provided in the COMPUTER 
PROGRAM LISTING herein, 

(b) a SERVER, or other computer, which runs the software, 
and 

(c) COMMUNICATION SYSTEMS which link with both the users 
and the REPOSITORY.

The SERVER and the COMMUNICATION SYSTEMS located within the 
EXTERNAL SERVICE are known in the art. As indicated in the Figure, 
the EXTERNAL SERVICE is distinct from the REPOSITORY, and under
separate control.

The invention does not disrupt the users' normal interaction 
with the REPOSITORY; the users can interact with both the 
REPOSITORY, as usual, and also with the EXTERNAL SERVICE. Dashed 
arrows 3 indicate the interaction. Several examples will provide 
illustrative modes of operation of the invention. 

First Example: Single User
Operation with respect to a single user will first be
explained. Figure 2 shows a hot list 4, submitted by USER 1, which 
identifies pages A and B as being of interest to USER 1. The 
invention allows the user to modify the hot list at later times.
In response to the hot list, the invention copies pages A and B
from the REPOSITORY, as indicated by the dashed arrows. These
PAGES will be termed "base pages." At this time, the originals of 
PAGES A and B remain in the REPOSITORY, and copies reside in the 
EXTERNAL SERVICE. 

Then, the invention periodically examines the originals of 
PAGES A and B, located in the REPOSITORY, for changes. In looking 
for changes, the invention first performs a preliminary check, 
based on information such as (1) dates of modification and (2) 
checksums.

Dates of modification may be added to a PAGE by the PAGE
provider. These dates directly indicate whether the originally 
archived version has changed. 

Checksums are generated by the invention. An example of a
checksum is the numerical sum of all characters in a line, or on
a page. If a checksum changes (indicating that the number of
characters has changed), the change indicates a high probability
that a change has occurred in the PAGE. (In practice, the 
checksums used are more complex than this simple example 
illustrates. Checksums are known in the art.) 
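
The simple checksum described above, and one possible stronger alternative, can be sketched as follows. This is a minimal illustration, not the code from the COMPUTER PROGRAM LISTING: the function names are hypothetical, and SHA-256 is an assumed choice, since the text says only that the checksums used in practice are more complex.

```python
import hashlib


def simple_checksum(text: str) -> int:
    """Numerical sum of all character codes, as in the simple example."""
    return sum(ord(ch) for ch in text)


def strong_checksum(text: str) -> str:
    """A more robust digest (SHA-256 is an assumption, not named in the text)."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def probably_changed(archived_text: str, current_text: str) -> bool:
    """Preliminary check: differing digests mean the PAGE needs a full comparison."""
    return strong_checksum(archived_text) != strong_checksum(current_text)
```

The sum-of-characters form is easy to compute but cannot see reordering: `simple_checksum("ab")` and `simple_checksum("ba")` are equal, whereas `probably_changed("ab", "ba")` reports a change. That is one reason a more complex checksum is preferable.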

If the preliminary check, either by dates of modification or 
checksums, indicates that changes have occurred, then the invention 
copies the present version of the PAGE into the EXTERNAL SERVICE, 
and compares it with the base page, in order to locate the changes. 
Computer programs for detecting such changes are known in the art, 
and some examples are given in the TECHNICAL APPENDIX. A preferred
program, not known in the prior art, is entitled W3NEWER, and was
developed by the inventors. W3NEWER is contained in the listing
located at the end of this Specification. 

When changes are found, the invention stores them in the 
EXTERNAL SERVICE. Figure 3 illustrates storage of the changes, by 
the small boxes 6 located below PAGES A and B. The DATEs within 
the boxes 6 indicate the dates on which the changes were saved. 

Figure 3A illustrates how the invention displays the history 
of versions. Column 7 indicates the number assigned to each 
version by the invention. Column 8 indicates the times when the 
respective versions were retrieved by the invention. Column 8A 
allows a user to select a pair of versions for a differencing 
operation, as discussed below. 

For ease of explanation, Figure 3 illustrates storage of base 
pages, which are early versions of PAGEs, together with subsequent 
changes, indicated by the boxes 6. However, in practice, it can 
be more efficient to perform storage in a reversed sense, by 
storing the latest version as the base page (instead of the early 
version) and storing the changes 6 from which early versions can 
be reconstructed. One reason is that users are expected to call 
for latest versions more frequently than early versions. Storage 
of the entire latest versions eliminates the need to reconstruct 
them. 
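
The reversed storage scheme can be sketched as below: the latest version is kept whole, and each stored change converts a newer version back into the one before it. This is an illustrative sketch only. The class and function names are hypothetical, versions are modeled as lists of lines, and Python's `difflib` stands in for whatever differencing program the archive actually uses.

```python
from difflib import SequenceMatcher


def make_delta(source, target):
    """Record how to rebuild `target` (list of lines) from `source`."""
    delta = []
    matcher = SequenceMatcher(a=source, b=target, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            delta.append(("copy", i1, i2))            # reuse lines of source
        elif j2 > j1:
            delta.append(("insert", target[j1:j2]))   # lines only in target
    return delta


def apply_delta(source, delta):
    """Rebuild the target version from `source` and a stored delta."""
    out = []
    for op in delta:
        if op[0] == "copy":
            out.extend(source[op[1]:op[2]])
        else:
            out.extend(op[1])
    return out


class ReverseArchive:
    """Keeps the latest version whole, plus deltas leading back in time."""

    def __init__(self, first_version):
        self.latest = list(first_version)
        self.reverse_deltas = []   # reverse_deltas[k] turns version k+2 into k+1

    def add_version(self, new_version):
        # Store how to get back to the old latest, then replace it.
        self.reverse_deltas.append(make_delta(new_version, self.latest))
        self.latest = list(new_version)

    def reconstruct(self, number):
        """Return version `number` (1 = earliest); the latest costs nothing."""
        page = self.latest
        for delta in reversed(self.reverse_deltas[number - 1:]):
            page = apply_delta(page, delta)
        return page
```

A request for the latest version applies no deltas at all, which matches the stated reason for preferring this arrangement; only requests for early versions pay the reconstruction cost.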

The changes, together with their base pages, form an archive, 
which allows reconstruction of a PAGE as of any date desired. For 
example:

— PAGE A itself (i.e., the base page), plus the changes
labeled DATE 1, allow reconstruction of the version of
PAGE A, as of DATE 1.

PAGE A itself, plus the changes labeled DATE 1 and
DATE 2, allow reconstruction, as of DATE 2, and so on.

When USER 1 wishes to view PAGE A, the invention ordinarily
retrieves and presents the current version. The invention also 
provides an option for reconstructing the PAGE, as of a date 
specified by the user, and presents it in the format shown in 
Figure 4. The program HTMLDIFF, contained in the listing, 
generates the image shown in Figure 4. The content of the page can
be divided into three classes. 

The first class contains material which has not changed. This 
class of material is displayed in the font, size, color, and 
background, as customary in documents downloaded from the 
REPOSITORY. 

The second class represents changes, and contains material 
not present in the base page, but which has been added. Brackets 
9 indicate such material. (The brackets 9 are part of Figure 4, 
and are not necessarily part of the page generated by the 
invention.) This material is presented in a particular font, 
particular size, particular color, and particular background. The 
choice of these parameters can be varied but, in general, they 
should be chosen to maximize contrast with the first class of 
material. In addition to the formatting described immediately 
above, the added material is further highlighted by arrows 7.

The third class contains material which was deleted from the 
base page. Deleted material can be handled in at least three ways. 
One, deleted material can be simply deleted, so that the page 
presented to the reader contains no reference to the deleted 
material. 

Two, the deleted material can be deleted, but a reference 
indicating the deletion is added, such as the phrase "Deleted 
material occurs here." In this case, the user can be given the 
option of fetching the deleted, non-visible, material. 

Three, deleted material can be presented, but indicated as
deleted, as by "redline" format, in which a horizontal line,
perhaps red in color, is drawn through the deleted material.
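
The three treatments of deleted material can be sketched as one rendering function with a mode switch. This is not the HTMLDIFF implementation. The mode names, the use of `<b>` for added material and `<strike>` for redlining, and line-level comparison via Python's `difflib` are assumptions made for illustration.

```python
from difflib import SequenceMatcher


def render(old, new, mode):
    """Merge two versions (lists of lines) into one display.

    mode "omit":   deleted material is silently dropped.
    mode "marker": a note is shown where material was deleted.
    mode "strike": deleted material is shown, struck through.
    """
    out = []
    matcher = SequenceMatcher(a=old, b=new, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag in ("delete", "replace"):
            if mode == "marker":
                out.append("[Deleted material occurs here.]")
            elif mode == "strike":
                out.extend(f"<strike>{line}</strike>" for line in old[i1:i2])
        if tag in ("insert", "replace"):
            # Added material is emphasized to contrast with unchanged text.
            out.extend(f"<b>{line}</b>" for line in new[j1:j2])
        if tag == "equal":
            out.extend(new[j1:j2])
    return out
```

With `old = ["a", "b", "c"]` and `new = ["a", "c", "d"]`, the "strike" mode keeps the deleted line `b` visible in struck-through form, while the "omit" mode shows only `a`, `c`, and the emphasized `d`.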

Figure 3B illustrates a display, generated by the invention, 
which indicates which PAGEs on a user's hot list have undergone 
changes.

Second Example: Multiple Users
In actual practice, multiple users are expected to use the 
invention. Each of them submits a hot list. In one approach of 
the invention, the procedure undertaken for a single user 
(described above) is repeated for multiple users: all PAGEs, on 
all hot lists, are copied into the EXTERNAL SERVICE. Then, for 
each hot list, the originals of the PAGES, located within the 
REPOSITORY, are monitored for changes, and the changes are 
retrieved into the EXTERNAL SERVICE, as described above. 

However, this approach contains inefficiencies. For example,
a given PAGE will probably be identified by more than one hot list. 
Repeatedly copying that PAGE, for each hot list, would entail 
storage of multiple copies of the same PAGE. Further, repeatedly
comparing the multiple copies with their originals in the 
REPOSITORY represents a waste of computer time: a single comparison 
would suffice. The invention reduces these inefficiencies by the 
approach shown in Figure 5. 

This Figure represents a modification of Figure 4, to which 
a hot list for USER 2 has been added. The added hot list specifies 
PAGES A and C. 

To process the new hot list, the invention first checks 
whether the PAGEs identified on the added hot list are archived 
within the EXTERNAL SERVICE. Since PAGE A, plus its changes, are 
already contained within the archive, that PAGE is not copied. But 
PAGE C, which is not present in the ARCHIVE, is archived, as 
indicated by the dashed arrow. 

At this time, all PAGEs identified on all hot lists are 
contained within the archive. To emphasize this fact, PAGE A is 
indicated twice: once for USER 1, and a second time by a dashed 
page 14, for USER 2, although, as stated above, PAGE A is stored 
only once. 

After archiving all necessary PAGEs, the originals, located 
within the REPOSITORY, are periodically monitored for changes, as 
described above. The changes are copied to the archive of the 
EXTERNAL SERVICE. 
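
The shared-archive approach for multiple hot lists can be sketched as follows. This is a simplified illustration under stated assumptions: `SharedArchive` and `fetch` are hypothetical names, `fetch` stands for whatever retrieves a page from the REPOSITORY, and only the latest content is kept here (the stored history of changes is omitted for brevity).

```python
class SharedArchive:
    """One archived copy per PAGE, however many hot lists name it."""

    def __init__(self, fetch):
        self.fetch = fetch      # callable: page name -> current content
        self.pages = {}         # page name -> archived content, stored once
        self.hot_lists = {}     # user -> set of page names

    def submit_hot_list(self, user, names):
        self.hot_lists[user] = set(names)
        for name in names:
            if name not in self.pages:          # already archived? skip copy
                self.pages[name] = self.fetch(name)

    def check_for_changes(self):
        """Compare each archived page with its original exactly once."""
        changed = []
        for name in sorted(self.pages):
            current = self.fetch(name)
            if current != self.pages[name]:
                self.pages[name] = current
                changed.append(name)
        return changed
```

When USER 1 lists PAGEs A and B and USER 2 lists PAGEs A and C, PAGE A is fetched and stored once, and each monitoring pass performs three comparisons rather than four.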
Flow Chart

An exemplary flow chart is shown in Figure 6, which refers to 
a single-user case. In block 20, the EXTERNAL SERVICE accepts hot 
lists from users. Then, in block 22, the EXTERNAL SERVICE checks
whether the PAGEs identified on the hot lists are contained within 
the archive. If not, the PAGEs are copied from the REPOSITORY, as 
indicated by block 26. 

Then the logic proceeds to block 29, where the originals of 
the PAGES, located in the REPOSITORY, are examined for changes. 
The examination can include the preliminary checks (for checksums 
and dates of modification) discussed above. When changes are
found, the entire PAGE containing them is downloaded to the 
EXTERNAL SERVICE, and the changes, indicated by blocks 6 in Figure 
3, are derived. Block 32 indicates relevant information stored in 
the EXTERNAL SERVICE. 

As users access the PAGEs, block 35 monitors the times of the
accesses, in order to identify which versions of each PAGE the user 
viewed last. These times are stored, as indicated by block 32 and 
dashed arrow 37. These times are used to determine which changes 
in Figure 4 are to be identified as new material, when a PAGE is 
called by each user. An example will illustrate. 

Figure 7, top, illustrates the time-history of changes made 
to PAGE A. USER 1 accessed this PAGE at time 2, as indicated. 
Block 35 in Figure 6 monitors and records this time (at TIME 2 in 
Figure 7, and not earlier, of course). 
If USER 1 again accesses the PAGE at time 5, then the
invention presents VERSION 1 to the USER. However, if the user 
accesses the PAGE at time 11, VERSION 2 had been created since the 
last access by USER 1. The invention had previously identified 
the changes, and copied them as indicated in Figure 2. Now, at the 
access at time 11, the invention presents VERSION 1, plus the 
changes which make VERSION 2, because block 35 in Figure 6 
indicates that the USER has not seen VERSION 2. 

Returning to the flow chart of Figure 6, block 39 indicates 
that, when a USER calls for a PAGE, the invention presents the 
current version, and indicates the changes made (as in Figure 4) 
since the USER last accessed that page. In the example immediately
above, the invention presents VERSION 2 of PAGE A, as in Figure 7, 
and indicates the changes made since VERSION 1, because VERSION 1 
was the last accessed by USER 1. 
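
The bookkeeping behind this example can be sketched as a small access log. The sketch is illustrative only: the class name, the integer times, and the convention that VERSION 1 is the base page are assumptions, and the change times for PAGE A (7 and 13) are taken from the hypothetical history of Figure 7.

```python
from bisect import bisect_right


class AccessLog:
    """Tracks which version of each PAGE every USER was last shown."""

    def __init__(self, change_times):
        # change_times: page -> sorted times at which new versions appeared;
        # VERSION 1 is the base page, each listed time starts the next version.
        self.change_times = change_times
        self.last_seen = {}    # (user, page) -> version number last presented

    def version_at(self, page, time):
        """Version of `page` that is current at `time`."""
        return 1 + bisect_right(self.change_times[page], time)

    def access(self, user, page, time):
        """Return (current version, version the user saw last or None)."""
        current = self.version_at(page, time)
        previous = self.last_seen.get((user, page))
        self.last_seen[(user, page)] = current
        return current, previous
```

When the two returned numbers differ, the display highlights the changes between them; when they are equal, nothing is marked as new. For USER 1 and PAGE A, accesses at times 2, 5, and 11 yield VERSION 1 (first access), VERSION 1 again (nothing new), and VERSION 2 compared against VERSION 1.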

The flow chart of Figure 6 should not be read as limiting the 
invention to a linear, sequential mode of operation. In practice,
multiple users can present hot lists simultaneously, and other 
operations shown in the flow chart can also occur together. 

Third Example: Notification of Changes 
The invention can notify USERs when changes in their hot- 
listed PAGES occur, as indicated by the dashed block 40 in Figure 
6. This notification can take the form of a flag which is
associated with the BASE PAGE in Figure 8. When the USER logs into 
the EXTERNAL SERVICE, the invention notifies the USER of the 
changes to the respective PAGEs. Figure 3B illustrates one
approach to identifying PAGEs which have changed. 

Other types of notification are possible. For example, the
invention need not wait for a user to access a PAGE. The invention 
can notify the user when changes have been found, as by sending an 
electronic mail message to the user. 

Fourth Example: Common Hot List 

The invention can maintain a predetermined hot-list, for a 
community of users. This hot list contains a list of PAGEs which 
are considered to be of general interest to the community. This 
hot list, and the PAGEs identified on it, are made publicly 
available, to all users, but on a read-only basis. Users cannot 
modify the hot list, or the pages. 

This predetermined hot list can serve as an instructional 
tool, to educate users in the operation of the invention, and to 
demonstrate desirable features. 

One Architecture of Data Storage
An illustrative approach to storage of the information
identified in block 32 of the flow chart of Figure 6 is illustrated
in Figure 8, which is explained with reference to Figure 7.

Figure 7 illustrates hypothetical changes to the three PAGEs 
identified by the two hot lists of Figure 5. PAGE A underwent 
changes at times 7 and 13. Page B underwent changes at time 10, 
and so on. 
In Figure 8, the arrows extending from the symbols "USER 1",
etc., indicate the times of access by the users. For example, USER 
1 accessed PAGE A, VERSION 1, at time 2. USER 1 then accessed PAGE 
A, VERSION 2, at time 9, and so on. 

The invention maintains a TABLE of these times, as indicated 
on the right side of Figure 8, together with a list of PAGEs, or 
documents, owned by each USER. Ownership is determined by the hot
lists. The invention also maintains (a) the BASE PAGES, (b) the 
changes to each, and (c) the times of each change, as indicated on 
the left side of the Figure. From this data, the invention is able 
to reconstruct any PAGE, as of any date subsequent to the date of 
the BASE PAGE. 

Additional Considerations 

1. One definition of "page" is that it refers to a unit of 
data, stored in a system, which is identified by a specific name. 
(In the WWW, all pages have unique names.) Other terms can refer 
to such units of data, such as "files" and "documents." In 
general, the particular name used will depend on the system storing 
the data. 

2. One definition of "repository" is a collection of data, 
which is accessible by computer. The repository may be available 
to the public, or access may be limited. In general, repositories 
are expected to be distributed, meaning that the storage locations 
are physically distributed over a wide geographic area, and linked 
together by a communication system.

3. It was stated above that the invention can reconstruct a 
page as of any selected date. The reconstruction is based on the 
changes 6 in Figure 3. These changes are detected periodically, 
and the periodicity is determined by each user of the system, 
subject to limits imposed by the designer and system administrator. 

For example, user A can specify a period of one day for 
checking for changes in the pages on user A's hot list; user B can 
specify a different period for B's pages, such as one week. The 
system administrator can specify that no period, for any user, can 
be shorter than one hour. 
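
The interaction between a user's requested period and the administrator's limit can be sketched as a simple clamp. The names `MIN_PERIOD` and `effective_period` are hypothetical, and the one-hour floor is the example value from the text rather than a fixed property of the system.

```python
from datetime import timedelta

# Example floor set by the system administrator (value from the text's example).
MIN_PERIOD = timedelta(hours=1)


def effective_period(requested: timedelta,
                     minimum: timedelta = MIN_PERIOD) -> timedelta:
    """Period actually used: the user's choice, but never below the floor."""
    return max(requested, minimum)
```

A request for daily or weekly checking is honored as given, while a request for checking every five minutes would be raised to one hour.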

Consequently, changes in a page, located in the REPOSITORY, 
will only appear in a reconstruction done by the EXTERNAL SERVICE 
after the changes have been detected, and not earlier. An example 
will illustrate this distinction. 

Assume that the invention looks for changes on odd-numbered 
dates. Thus, a change occurring on the fourth of a month will be 
detected on the fifth. However, if a user happens to call for 
reconstruction on the fourth, the change occurring on the fourth 
will not appear in the reconstruction. Only changes occurring as 
of the prior detection, namely, as of the third, will appear. 

It is expected that the detection process will be performed 
sufficiently often that the influence of this factor will be 
negligible. 
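
The odd-numbered-date example can be worked through in code. This sketch assumes days of the month numbered from 1, and assumes each detection run captures every change made up to and including its own day; both function names are hypothetical.

```python
def last_detection_day(request_day: int) -> int:
    """Most recent odd-numbered day on or before `request_day` (days start at 1)."""
    return request_day if request_day % 2 == 1 else request_day - 1


def visible_changes(change_days, request_day):
    """Changes that can appear in a reconstruction requested on `request_day`."""
    cutoff = last_detection_day(request_day)
    return [day for day in change_days if day <= cutoff]
```

With changes on the second and fourth of the month, a reconstruction requested on the fourth reflects only the change of the second, because the last detection ran on the third. A request on the fifth reflects both.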
4. The invention can extend its differencing function (i.e.,
the examination of pages for changes) to pages referenced by the
page accessed by the user. For example, if the user accesses
document D in Figure 1B, the invention can detect changes in all
documents referenced by document D, such as D1, D2, and D3.

In another embodiment, the differencing can extend to the 
documents which are, in turn, referenced by the referenced 
documents. For example, the referenced documents (D1, D2, and D3)
refer to D5 and D6. These latter documents (D5 and D6) can be 
differenced also, as can be the documents which they reference, and 
so on. 
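
Extending the differencing to referenced documents amounts to a bounded walk over the reference graph. The sketch below is one way to express that, assuming the references are available as a dictionary; the function name, the `max_depth` parameter, and the particular link structure used in the usage note are hypothetical.

```python
from collections import deque


def pages_to_difference(start, references, max_depth):
    """Breadth-first walk: pages reachable from `start` in at most
    `max_depth` reference hops, in the order they are discovered."""
    seen = {start}
    order = []
    queue = deque([(start, 0)])
    while queue:
        page, depth = queue.popleft()
        if depth == max_depth:
            continue                      # do not follow references further
        for ref in references.get(page, []):
            if ref not in seen:           # each page is differenced once
                seen.add(ref)
                order.append(ref)
                queue.append((ref, depth + 1))
    return order
```

With `references = {"D": ["D1", "D2", "D3"], "D1": ["D5"], "D2": ["D6"]}`, a depth of 1 selects D1, D2, and D3, and a depth of 2 adds D5 and D6. The `seen` set keeps the walk finite when pages reference each other in a cycle.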

5. The invention provides all information from which a 
current version of a PAGE may be derived. Figure 4 gives an 
example. Figure 4 contains all such information, together with 
other information which indicates changes since a previous version. 

6. The discussion above presumed that comparison, or 
differencing, between different versions of a PAGE was done within 
the EXTERNAL SERVICE. This is not strictly necessary; the 
comparison can be done at any convenient location. Further, the 
preliminary checking for the existence of changes can be done at 
any convenient location. 

7. In data storage systems, names are given to the units of
information (e.g., documents, pages, records), although the names
can be different in different databases. However, the names of the
units, in general, remain the same throughout time, despite changes
which are made to the information contained in the unit.
Therefore, one definition of the term "version" refers to a unit
of information, which is different from a previous unit of the same
name.

8. The REPOSITORY in Figure 1 is, in general, located 
remotely from the EXTERNAL SERVICE. Communication is undertaken
by any convenient approach, such as a public-access communication 
network known as the INTERNET. 

In general, the REPOSITORY is under control independent of the
EXTERNAL SERVICE. One ramification of this independent control is
that the type of processing done to the PAGEs copied into the 
EXTERNAL SERVICE is controlled by the EXTERNAL SERVICE, and not by 
the REPOSITORY. For example, (a) the particular processes used in 
locating and storing differences, (b) the frequency of processing,
and (c) the mode of notifying a user, are controlled by the 
designer of the EXTERNAL SERVICE. The operator of the REPOSITORY 
has no involvement in this processing. 

9. Figure 9 illustrates another form of the invention. The 
invention maintains base pages 30 within the EXTERNAL SERVICE, as 
required by the hot lists 36. The base pages 30 were downloaded
from respective repositories 42A, 42B, etc. 

The invention periodically monitors the originals 30A of the 
pages, located in the repository 42, for changes, and stores the
changes within the EXTERNAL SERVICE. The invention notifies users
when changes are found in pages on their hot lists (notification
is not shown).

A version control system 39 allows users to fetch and view any 
version of any page. 

10. The different versions of documents may contain drawings,
files from which sound may be generated, files which produce video 
clips and animation, and other components which do not consist 
strictly of alphanumeric characters. The invention detects the 
existence of changes in such components, and marks the existence 
of the changes, in the display as shown in Figure 4, without
necessarily identifying in detail the nature of the changes. 

11. A primary use of the invention is envisioned in the 
situation shown in Figure 10. The EXTERNAL SERVICE obtains copies 
of PAGES from a REPOSITORY, such as WWW. However, the EXTERNAL
SERVICE is given no authority to replace or modify the pages 
contained in the REPOSITORY. To the EXTERNAL SERVICE, the PAGEs 
represent read-only data, as indicated by the "X" over arrow 50, 
which indicates a write operation. 

The EXTERNAL SERVICE performs differencing between currently 
copied versions of pages, and DATA representing previous versions. 
The DATA stored in the EXTERNAL SERVICE can be both read, and 
written to, by the EXTERNAL SERVICE. The EXTERNAL SERVICE 
reconstructs any version on demand, and also indicates differences
between any two versions selected by a user, as discussed above.
These functions can be accomplished by a prior-art Revision Control
System, RCS (also called a Version Control System), or by the code
contained in the listing contained in this Specification. 

12. In one form of the invention, the PAGEs retrieved are 
written in a "markup language," such as HyperText Mark-up Language 
(HTML). A mark-up language, in general, contains two types of 
codes, interspersed among the actual text of a document. 

One type indicates how the PAGEs are to be displayed. For 
example, some codes indicate paragraph indentation, other codes
indicate font styles, yet other codes indicate style of font,
within a font, such as italicizing, underlining, double-striking, 
or bold printing. This type of code is referred to as format- 
defining. 

A second type of code can identify an image, such as a bit- 
mapped file located elsewhere. When such a code is read by the 
system displaying the PAGE, a copy of the image is retrieved, and 
displayed within the PAGE, at the location specified by the code. 
This type of code is referred to as content-defining. 

The invention does not treat changes in the format-defining 
codes as changes in content. Thus, a PAGE which changes in layout, 
or typestyle, only, is not designated as a changed page. 

The differencing program contained in the COMPUTER PROGRAM 
LISTING compares different versions on a subunit-by-subunit basis. 
For example, the program compares corresponding sentences in
different versions, and the sentences are detected by sentence 
terminators. (Longer subunits can be used, such as paragraphs or 
pages.) The sentence terminators are a subset of the markup 
language. Specifically, the terminators are format-defining codes. 
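
Sentence-by-sentence comparison that ignores purely format-defining changes can be sketched as follows. This is not the HTMLDIFF program from the listing (which is written in SML); it is a simplified Python illustration. The set of tags treated as terminators, the regular expressions, and the function names are assumptions.

```python
import re
from difflib import SequenceMatcher

# Markup commands assumed to end a sentence (hypothetical subset).
BREAK_TAG = re.compile(r"</?(?:p|br|li|h[1-6])\b[^>]*>", re.IGNORECASE)
ANY_TAG = re.compile(r"<[^>]+>")
# Split after sentence-ending punctuation, or at a terminator tag (now "\n").
SPLIT = re.compile(r"(?<=[.!?])\s+|\n+")


def sentences(html):
    """Break a page into sentences, discarding all markup."""
    text = BREAK_TAG.sub("\n", html)   # terminator tags become breaks
    text = ANY_TAG.sub("", text)       # remaining markup is dropped
    parts = SPLIT.split(text)
    return [" ".join(part.split()) for part in parts if part.strip()]


def changed_sentences(old_html, new_html):
    """Sentences of the new version that are added or altered."""
    old, new = sentences(old_html), sentences(new_html)
    added = []
    matcher = SequenceMatcher(a=old, b=new, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag in ("insert", "replace"):
            added.extend(new[j1:j2])
    return added
```

Because markup is discarded before comparison, wrapping a sentence in `<b>` or `<i>` leaves it unchanged in this sketch, consistent with the statement that a PAGE which changes only in layout or typestyle is not designated as changed. A change in wording marks the whole sentence as new.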

COMPUTER PROGRAM LISTING 
The program listing is divided into three sections. 

1. HTMLDIFF, comprising:

— html_diff.sml (5 pages),

— diff.sml (3 pages),

— mlveb.sml (4 pages), and

— html.lex (one page).

2. W3NEWER (17 pages).

3. NOHANDS, comprising:

— nohandsBE (11 pages),
no-hands.cgi (3 pages),
rcsdiff.cgi (4 pages), and
snapshot.cgi (3 pages).

NOHANDS is an overall program set which utilizes W3NEWER and 
HTMLDIFF. 

TECHNICAL APPENDIX 

A TECHNICAL APPENDIX, totalling 12 pages, two of which are 
blank, follows, and refers to Figures 11-14.

Numerous substitutions and modifications can be undertaken 
without departing from the true spirit and scope of the invention.
What is desired to be secured by Letters Patent is the invention 
as defined in the following claims. 
CLAIMS

1. A system, comprising:

a) means for copying versions of pages from an external 
repository, which provides pages to the system on a read- 
only basis; 

b) means for storing data from which 

i) selected versions can be reconstructed; 
and 

ii) differences between selected versions can 
be identified. 

2. A system according to claim 1, in which the versions 
reconstructed are specified on lists provided by users. 

3. In a system for allowing multiple users to interface with 
another system which contains versions of data, at least some of 
which change as time progresses, the improvement comprising: 

a) means for allowing a user to identify a current 
version and a previous version; and 

b) means for displaying the current version in a format 
which distinguishes the current version from the previous 
version. 

4. A method of communicating with a repository which stores
data units in the form of current versions which change over time,
comprising:

a) accepting lists of data units from multiple users; 

b) maintaining an archive, which contains material which
allows reconstruction of selected versions of data units 
contained on the lists; and 

c) at intervals, comparing current versions with 
archived material, and updating the archive. 

5. Method according to claim 4, and further comprising the
step of 

d) recording which versions each user has accessed. 

6. Method according to claim 5, and further comprising the 
steps of: 

e) accepting a request from a user for a data unit; 

f) presenting a current version of the data unit; and 

g) highlighting material within the current version 
which was absent from previous versions accessed by the 
user. 

7. A method of communicating with a repository which stores 
data units in the form of versions which change over time, 
comprising: 

a) maintaining an archive, which includes material which 
allows reconstruction of selected versions of data units; 

b) accepting lists of data units from users;

c) for each list accepted, checking the archive and, if 
listed data units are not archived, copying a current 
version from the repository; 

d) at intervals, 

i) finding differences between (A) versions 
contained in the archive, and (B) current 
versions, contained in the repository, and 

ii) storing the differences in the archive; 

e) maintaining information about each user, which 
indicates most recent versions accessed by the user; and 

f) accepting a request from a user for a current version
and, in response, 

i) copying the current version from the 
repository; 

ii) finding differences between the current 
version and the most recent versions accessed; 
and 

iii) displaying the current version in a 
format which highlights the differences. 

8. Apparatus for comparing a first text written in a markup 
language, which includes sequences of words and markup commands, 
with a second such text comprising: 

a) means for comparing the first and second text, which
compares sequences in the first and second text which
are defined by a group of terminators, including
sentence-ending punctuation and markup commands,
belonging to a first subset thereof, and produces a
result which indicates differences between the texts; and

b) means for receiving the result and displaying the
differences in response thereto.

9. Apparatus for comparing versions of pages in a repository, 
comprising: 

a) means for detecting that a second page is a new 
version of a first page; 

b) means responsive to the detecting means for comparing 
the first page with the second page to produce a result 
which indicates differences between the first and second 
pages; and

c) means responsive to the result for displaying the 
differences. 

10. A method for detecting whether a page in a repository has 
been modified since it was last examined by a given user, the 
method comprising the steps performed in the computer system used 
by the given user of: 

a) maintaining a first record of last viewed times for
pages in that computer system which indicates, for each
page in a first set of pages, the last time at which the
given user viewed the page and for a given page for which
a last viewed time is recorded in the first record,
performing the steps of obtaining a last modified time
for the given page from a second record in that computer
system of last modified times of pages and determining
whether the last modified time from the second record is
a time such that the time is later than the last viewed
time for the given record;

b) if the second record does not provide a last modified 
time which is such a time, obtaining a last modified time 
from a source external to that computer system to which 
the user has access and determining whether the last 
modified time from the external source is such a time;
and 

c) if the last modified time obtained from the external 
source is such a time, updating the second record with 
that last modified time. 

11. A method for maintaining a user-defined version history 
of a page, the method comprising the steps of: 

a) receiving an indication from the user specifying that 
a given version of a page is to be retained in a version 
history for that page; 

b) if the given version is the first version of the page 
in the version history, storing a copy of the given 
version in the version history; and 

c) otherwise storing at least any difference between the
given version and the most recent previous version in
the version history.

TECHNICAL APPENDIX



Tracking and Viewing Changes on the Web 



Abstract 

We describe a set of tools that detect when World-Wide-Web
pages have been modified and present the modifications
visually to the user through marked-up HTML. The tools
consist of three components: w3newer, which detects changes
to pages; snapshot, which permits a user to store a copy of
an arbitrary Web page and to compare any subsequent version
of a page with the saved version; and htmldiff, which marks
up HTML text to indicate how it has changed from a previous
version. We refer to the tools collectively as the
Network-Oriented HTML Archival, Notification, and
Differencing System (NO HANDS). This paper discusses several
aspects of NO HANDS, with an emphasis on systems issues such
as scalability, security, and error conditions.



1 Introduction 

Use of the World-Wide-Web (W3) has increased dramatically
over the past couple of years, both in the volume of traffic
and the variety of users and content providers. The W3 has
become an information distribution medium for academic
environments (its original motivation), commercial ones, and
virtual communities of people who share interests in a wide
variety of topics. Information that used to be sent out over
electronic mail or USENET, both active media that go to users
who have subscribed to mailing lists or newsgroups, can now
be posted on a W3 page. Users interested in that data then
visit the page to get the new information.

The URLs of pages of interest to a user can be saved in a
"hotlist" (known as a bookmark file in Netscape), so they can
be visited conveniently. How does a user find out when pages
have changed? If users know that pages contain
up-to-the-minute data (such as stock quotes), or are
frequently changed by their owners, they may visit the pages
often. Other pages may be ignored, or browsed by the user
only to find they have not changed.

In recent months, several tools have become available to
address the problem of determining when a page has changed.
One example of such a tool is webwatch, a product for Windows
that uses the HTTP HEAD command to find out when a page has
been modified since it was last viewed by a user's web
browser, and generates a report in HTML that allows the user
to go directly to those updated pages. Another example is
w3new, by Brooks Cutter, a public-domain perl script that
runs on UNIX [2].

Each of these tools suffers from a significant deficiency:
while they provide the user with the knowledge that the page
has changed, they do not show how the page has changed.
Although a few pages are edited by their maintainers to
highlight the most recent changes, often the modifications
are not prominent, especially if the pages are large. Even
pages with special highlighting of recent changes are
problematic: if a user visits a page frequently, what is
"new" to the maintainer may not be "new" to the user.
Alternatively, a user who visits a page infrequently may miss
changes that the maintainer deems to be old.

We have developed a system that efficiently tracks when pages
change, compactly stores versions on a per-user basis, and
automatically compares and presents the differences between
pages. NO HANDS (Network-Oriented HTML Archival,
Notification, and Differencing System) provides
"personalized" views of versions of pages with three tools.
The first, w3newer, is a more scalable version of Cutter's
w3new modification tracking tool that periodically accesses
the W3 to find when pages on a user's hotlist have changed.
The second, snapshot, allows a user to save versions of a
page and later use a third tool, htmldiff, to see how it has
changed. Htmldiff automatically compares two HTML pages and
creates a "merged" page to show the differences with special
HTML markups.



Netscape is a trademark of Netscape Communications Corporation.
Windows is a trademark of Microsoft.
UNIX is a registered trademark of X/Open.

checked. The threshold can vary depending on the URL, with
perl pattern matching used to determine what threshold to
apply. The first matching pattern is used. Table 1 gives an
example of a w3newer thresholds configuration file.
Thresholds are specified as combinations of days (d) and
hours (h), with 0 indicating that a page should be checked on
every run of w3newer and never indicating that it should
never be checked.



    # Comments start with a sharp sign.
    # perl syntax requires that . be escaped
    # Default is equivalent to ending the file with ".*"
    Default                                                               2d
    file:.*                                                               0
    http://www\.yahoo\.com/.*                                             7d
    http://www\.research\.att\.com/.*                                     0
    http://.*\.att\.com/.*                                                1h
    http://home\.mcom\.com/home/whatsnew/whats_new\.html                  12h
    http://www\.ncsa\.uiuc\.edu/SDG/Software/Mosaic/Docs/whats-new\.ht.*  12h
    http://snapple\.cs\.washington\.edu:600/mobile/                       1d
    # rarely modified
    http://www\.cs\.duke\.edu/~pk/HomePage\.html                          7d
    # this is in my hotlist but will be different every day
    http://www\.unitedmedia\.com/comics/dilbert/                          never

Table 1: An example of the thresholds specified to
w3newer.
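
As an illustration of how a file in this format might be
interpreted, the following Python sketch parses threshold lines and
returns the threshold of the first matching pattern. It is not the
w3newer implementation (which is a perl script). The function names,
the NEVER sentinel, the use of full-string matching, and the choice
to move the Default line to the end of the rule list are assumptions
made for this example.

```python
import re
from datetime import timedelta

NEVER = None  # sentinel used by this sketch for the "never" threshold


def parse_threshold(text):
    """Convert '0', '12h', '2d', '1d12h' or 'never' to a timedelta (or NEVER)."""
    if text == "never":
        return NEVER
    if text == "0":
        return timedelta(0)
    match = re.fullmatch(r"(?:(\d+)d)?(?:(\d+)h)?", text)
    if not text or match is None:
        raise ValueError(f"bad threshold: {text!r}")
    days, hours = (int(g or 0) for g in match.groups())
    return timedelta(days=days, hours=hours)


def load_thresholds(lines):
    """Return an ordered list of (compiled pattern, threshold) rules."""
    rules, default_rule = [], None
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        pattern, value = line.split()
        if pattern == "Default":
            # Default behaves like a trailing ".*" rule, so it is applied last.
            default_rule = (re.compile(".*"), parse_threshold(value))
        else:
            rules.append((re.compile(pattern), parse_threshold(value)))
    if default_rule:
        rules.append(default_rule)
    return rules


def threshold_for(rules, url):
    """Return the threshold of the first rule whose pattern matches the URL."""
    for pattern, threshold in rules:
        if pattern.fullmatch(url):  # assumption: patterns cover the whole URL
            return threshold
    raise LookupError(f"no threshold rule matches {url}")
```

With rules loaded from lines like those in Table 1, a URL on
www.yahoo.com would be checked at most once every seven days, while
a URL that matches no specific pattern falls through to the Default
entry.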



2.3 Cache Consistency Issues

Determining when HTTP pages have changed is analogous to
caching a file in a distributed file system and determining
when the file has been modified. While file systems such as
the Andrew File System [6] and Sprite [8] provide guarantees
of cache consistency by issuing call-backs to hosts with
invalid copies, HTTP access is closer to the traditional NFS
[12] approach, in which clients check back with servers
periodically for each file they access. Netscape can be
configured to check the modification date of a cached page
each time it is visited, once each session, or not at all.
Caching servers check when a client forces a full reload, or
after a time-to-live value expires.

Here the problem is complicated by the target environment:
one wishes to know not only when a currently viewed page has
changed, but also when a page that has not been seen in a
while has changed. Fortunately, unlike with file systems,
HTTP data can usually tolerate some inconsistency. In the
case of pages that are of interest to a user but have not
been seen recently, finding out within some reasonable period
of time, such as a day or a week, will usually suffice. Even
if servers had a mechanism to notify all interested parties
when a page has changed, immediate notification might not be
worth the overhead.

Instead, one could envision using something like the Harvest
replication and caching services [1] to notify interested
parties in a lazy fashion. A user who expresses an interest
in a page, or a browser that is currently caching a page,
could register an interest in the page with its local caching
service. The caching service would in turn register an
interest with an Internet-wide, distributed service that
would make a best effort to notify the caching service of
changes in a timely fashion. (This service could potentially
archive versions of HTTP pages as well.) Pages would already
be replicated, with server load distributed, and the
mechanism for discovering when a page changes could be left
to a negotiation between the distributed repository and the
content provider: either the content provider notifies the
repository of changes, or the repository polls it
periodically. Either way, there would not be a large number
of clients polling each interesting HTTP server. Moving
intelligence about HTTP caching to the server has been
proposed by Gwertzman and Seltzer [3] and others.

One could also envision integrating the functionality of
NO HANDS into file systems. Tools that can take actions when
arbitrary files change are not widely available, though they
do exist [11]. Users might like to have a unified report of
new files and pages, and w3newer supports the "file:"
specification and can find out if a local file has changed.
However, snapshot has no way to access a file on the user's
(remote) file system. Moving functionality into the browser
would allow individual users to take snapshots of files that
are not already under the control of a versioning system such
as the Revision Control System (RCS) [14]; this might be an
appropriate use of a browser with client-side execution, such
as HotJava [13].

2.4 Error Conditions 

When a periodic task checks the status of a large number of
URLs, a number of things can go wrong. Local problems such as
network connectivity or the status of a proxy-caching server
can cause all HTTP requests to fail. Proxy-caching servers
are sometimes overloaded to the point of timing out large
numbers of requests, and a background task that retrieves
many URLs in a short time can aggravate their condition.
W3newer should therefore be able to detect cases when it
should abort and try again later (preferably in time for the
user to see an updated report).

At the same time, a number of errors can arise with
individual URLs. They can move, with or without leaving a
forwarding pointer. The server for a URL can be deactivated
or renamed. They may disallow retrieval by "robots," meaning
that any program that follows the "robot exclusion protocol"
[10] will not retrieve them. Since the cost of retrieving
modification dates is small in comparison to the cost of
retrieving robots.txt (part of the exclusion protocol), it
may well be appropriate to ignore the robot exclusion
protocol for this task, or to check robots.txt only
occasionally on each host. Observing the protocol will still
be advisable for hosts on which many URLs are checked,
especially if the pages' contents are retrieved each time.

Finally, automatic detection of modifications based on
information such as modification date and checksum can lead
to the generation of "junk mail" as "noisy" modifications
trigger change notifications. For instance, pages that report
the number of times they have been accessed, or embed the
current time, will look different every time they are
retrieved.

W3newer attempts to address these issues by the following
steps:

• If a URL is inaccessible to robots, that fact is cached
so the page is not accessed again unless a special flag
is set when the script is invoked.

• Another flag can tell w3newer to treat error conditions
as a successful check as far as the URL's timestamp goes.
For instance, if w3newer runs daily and checks a
particular URL every four days, normally an error
accessing the page on Monday will cause it to be checked
again on Tuesday. With this flag, it would be checked
again on Friday. In general, it seems that errors are
likely to be transient, and checking the next time
w3newer is run would be reasonable.

• When a URL is inaccessible, an error message appears
in the status report, so the user can take action to
remove a URL that no longer exists or repeatedly hits
errors.

In addition, w3newer could be modified to keep a running
counter of the number of times an error is encountered for a
particular URL, or to skip subsequent URLs for a host if a
host or network error (such as "timeout" or "network
unreachable") has already occurred. Addressing the problem of
"noisy" modifications will require heuristics to examine the
differences at a semantic level.
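
The scheduling rule from the Monday/Friday example and the two
possible modifications (a per-URL error counter, and skipping the
remaining URLs of a host after a host or network error) can be
sketched in Python as follows. The names next_check, run_checks,
HostError and errors_count_as_checked are hypothetical and exist
only for this illustration; the fetch callable is supplied by the
caller and stands in for whatever actually retrieves a page.

```python
from collections import Counter
from datetime import datetime, timedelta
from urllib.parse import urlsplit


class HostError(Exception):
    """Hypothetical error for host or network failures (timeout, unreachable)."""


def next_check(attempted, threshold, failed, errors_count_as_checked=False):
    """Return the earliest time at which the URL should be checked again."""
    if failed and not errors_count_as_checked:
        return attempted  # eligible again on the very next run
    return attempted + threshold  # treat the attempt as a completed check


def run_checks(urls, fetch, error_counts):
    """Check each URL, skipping hosts that already produced a host error."""
    bad_hosts, report = set(), {}
    for url in urls:
        host = urlsplit(url).netloc
        if host in bad_hosts:
            report[url] = "skipped: earlier error on this host"
            continue
        try:
            report[url] = fetch(url)
        except HostError as exc:
            bad_hosts.add(host)
            error_counts[url] += 1  # running count of errors per URL
            report[url] = f"error: {exc}"
    return report
```

Passing a Counter that persists between runs as error_counts would
let the status report single out URLs that fail repeatedly.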

3 Snapshots: External Representations of Version Histories

In addition to providing a mechanism for determining when
W3 pages have been modified, there must be a way to access
multiple versions of a page for the purposes of comparison.
This section describes methods for maintaining version
histories and several issues that arise with our solution.

3.1 Alternative Approaches 

There are three possible approaches for providing versioning
of pages: making each content provider keep a history of all
versions, making each user keep this history, or storing the
version histories on an external server.

Server-side support: Each server could store a history of
its pages and provide a mechanism to use that history to
produce marked-up pages that highlight changes. This
method requires arbitrary content providers to provide
versioning and differencing, so it is not practical,
although it is desirable to support this feature when
the content provider is willing. (See Section 6.)

Client-side support: Each user could run a program that
would store items in the hotlist locally, and run
htmldiff against a locally saved copy. This method
requires that every page of interest be saved by every
user, which is unattractive as the number of pages in
the average user's hotlist increases, and it also
requires the ability to run htmldiff on every platform
that runs a browser. Storing the pages referenced by the
hotlist may not be too unreasonable, since programs like
Netscape may cache pages locally anyway. There are other
external tools such as warmlist [16] that provide this
functionality.

External service: Our approach is to run a service that is
separate from both the content provider and the client.
Pages can be registered with the service via an HTML
form, and differences can be retrieved in the same
fashion. Once a page is stored with the service,
subsequent requests to remember the state of the page
result in an RCS "check-in" operation that saves only
the differences between the page and its previously
checked-in version. Thus, except for pages that change
in many respects at once, the storage overhead is
minimal beyond the need to save a copy of the page in
the first place.

Drawbacks to the "external service" approach are that the
service must remember the state of every page that anyone who
uses the service has indicated an interest in and must know
which user has seen which version of each page. The first
issue is primarily one of resource allocation, and is not
expected to be a significant issue unless the service is used
by a great many clients on a number of large pages. The
second issue is addressed by using RCS's support for
datestamps and requesting a page as it existed at a
particular time. Alternatively, a version number could be
retained for each <user,URL> combination.
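
The bookkeeping described above (one full copy on the first
check-in, only differences afterwards, and a version number per
<user,URL> combination) can be sketched in Python as follows. This
is not RCS and does not use its file format: the sketch keeps
forward deltas computed with the standard difflib module, and the
Archive class and its method names are hypothetical.

```python
import difflib


def make_delta(old, new):
    """Describe `new` as copies from `old` plus the lines that were added."""
    matcher = difflib.SequenceMatcher(a=old, b=new, autojunk=False)
    delta = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            delta.append(("copy", i1, i2))      # reuse old[i1:i2]
        elif j2 > j1:
            delta.append(("add", new[j1:j2]))   # store only the new lines
    return delta


def apply_delta(old, delta):
    """Rebuild the newer version from the older one and a delta."""
    lines = []
    for op in delta:
        lines.extend(old[op[1]:op[2]] if op[0] == "copy" else op[1])
    return lines


class Archive:
    def __init__(self):
        self.base = {}    # url -> lines of version 1
        self.deltas = {}  # url -> deltas leading to versions 2, 3, ...
        self.seen = {}    # (user, url) -> version number last saved by user

    def latest(self, url):
        return 1 + len(self.deltas[url])

    def checkout(self, url, version):
        lines = self.base[url]
        for delta in self.deltas[url][:version - 1]:
            lines = apply_delta(lines, delta)
        return lines

    def check_in(self, user, url, text):
        lines = text.splitlines()
        if url not in self.base:
            self.base[url], self.deltas[url] = lines, []
        else:
            current = self.checkout(url, self.latest(url))
            if lines != current:  # unchanged pages add no new version
                self.deltas[url].append(make_delta(current, lines))
        self.seen[(user, url)] = self.latest(url)
        return self.seen[(user, url)]
```

A later request from a user can then compare
checkout(url, seen[(user, url)]) with the page as currently
retrieved.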

Relative links become a problem when a page is moved away
from the machine that originally provided it. If the source
were passed along unmodified, then the browser would consider
links to be relative to the CGI directory containing the
snapshot script. HTML supports a Base directive that makes
relative links relative to a different URL, which mostly
addresses this problem; however, Netscape 1.1N treats
internal links within such a document to be relative to the
new base as well, which can cause the browser to jump between
the htmldiff output and the original document unexpectedly.

3.2 System Issues 

The snapshot facility must address four important issues:
use of CGI, synchronization, resource utilization, and
security/privacy.

CGI is a problem because there is no way for snapshot to
interact with the user and the user's browser, other than by
sending HTML output. When a CGI script is invoked, httpd sets
up a default timeout, and if the script does not generate
output for a full timeout interval, httpd will return an
error to the browser. This was a problem for snapshot because
the script might have to retrieve a page over the Internet
and then do a time-consuming comparison against an archived
version. The server does not tell snapshot what a reasonable
timeout interval might be for any subsequent retrievals;
instead this is hard-coded into the script. In order to keep
the HTTP connection alive, snapshot forks a child process
that generates one space character (ignored by the browser)
every several seconds while the parent is retrieving a page
or executing htmldiff.
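
The keep-alive idea can be sketched in Python as follows. Note
that this sketch swaps in a background thread for the forked child
process described above, and that run_with_keepalive and its
parameters are hypothetical names; the five-second default merely
stands in for "every several seconds".

```python
import sys
import threading


def run_with_keepalive(work, out=sys.stdout, interval=5.0):
    """Run work() while periodically writing one space to `out`."""
    stop = threading.Event()

    def heartbeat():
        # Event.wait returns False on timeout, True once stop is set.
        while not stop.wait(interval):
            out.write(" ")
            out.flush()

    thread = threading.Thread(target=heartbeat, daemon=True)
    thread.start()
    try:
        return work()  # e.g. retrieve the page, then run the comparison
    finally:
        stop.set()     # stop emitting spaces before the real output starts
        thread.join()
```

Because the heartbeat is stopped and joined before the function
returns, the caller can write the real HTML output afterwards
without interleaving.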

Synchronization between simultaneous users of the facility
is complicated by the use of multiple files for bookkeeping.
The system must synchronize access to the RCS repository, the
locally cached copy of the HTML document, and the control
files that record which version of each page a user has seen.
Currently this is done by using UNIX file locking on both a
per-URL lock file and the per-user control file. Ideally the
locks could be queued such that if multiple users request the
same page simultaneously, the second snapshot process would
just wait for the page and then return, rather than repeating
the work. This is not so important for making snapshots, in
which case a proxy-caching server can respond to the second
request quickly and RCS can easily determine that nothing has
changed, but there is no reason to run htmldiff twice on the
same data.
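
A per-URL lock file of the kind mentioned above might look like
the following Python sketch, which relies on the UNIX-only fcntl
module. The url_lock name, the lock directory argument, and the
use of a SHA-1 digest to derive the lock file name are assumptions
made for this example.

```python
import fcntl
import hashlib
import os
from contextlib import contextmanager


@contextmanager
def url_lock(lock_dir, url):
    """Hold an exclusive advisory lock on a lock file derived from the URL."""
    name = hashlib.sha1(url.encode("utf-8")).hexdigest() + ".lock"
    path = os.path.join(lock_dir, name)
    with open(path, "w") as handle:
        fcntl.flock(handle, fcntl.LOCK_EX)  # blocks until the lock is free
        try:
            yield path
        finally:
            fcntl.flock(handle, fcntl.LOCK_UN)
```

A process would wrap the retrieval, the RCS operation, and the
comparison for one URL in a single "with url_lock(...)" block, so
that a second request for the same URL waits for the first.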

The latter point relates to the general issue of resource
utilization. Snapshot has the potential to use large amounts
of both processing and disk space. The need to execute
htmldiff on the server can result in high processor loads if
the facility is heavily used. These loads can be alleviated
by caching the output of htmldiff for a while, so many users
who have seen versions N and N+1 of a page could retrieve
htmldiff(page_N, page_N+1) with a single invocation of
htmldiff. The facility could also impose a limit on the
number of simultaneous users, or replicate itself among
multiple computers, as many services do.

Disk space is potentially a problem if the repository can
grow without bound and with no cost to its users. In fact,
before a service like this could be placed on the Internet,
it would have to authenticate each user and limit the user to
a fixed number of URLs and/or disk blocks. Most likely, one
would use an Internet commerce facility to charge a fee in
exchange for permission to store a collection of URLs; this
fee could easily offset the cost of the storage medium since
it would also be paying for the differencing service.

Lastly, security and privacy are important. Because the CGI
scripts run with minimal privileges, from an account to which
many people have access, the data in the repository is
vulnerable to any CGI script and any user with access to the
CGI area. Data in this repository can be browsed, altered, or
deleted. In order to use the facility one must give an
identifier (currently one's email address, which anyone can
specify) that is used subsequently to compare version
numbers. Browsing the repository can therefore indicate which
user has an interest in which page, how often the user has
saved a new checkpoint, and so on.

By moving to an authenticated system on a secure machine,
one could break some of these connections and obscure
individuals' activities while providing better security. The
repository would associate impersonal account identifiers
with a set of URLs and version numbers, and passwords would
be needed to access one of these accounts. Whoever
administers this facility, however, will still have
information about which user accesses which pages, unless the
account creation can be done anonymously.

4 HtmlDiff: Comparison of HTML 
pages 

In our experience, only a small fraction of pages on the W3
contain information that allows users to ascertain how the
pages have changed; examples include icons that highlight
recent additions, a link to a "changelog", or a special
"What's New" page. As was mentioned in the introduction,
these approaches suffer from deficiencies. They are intended
to be viewed by all users, but users will visit the pages at
different intervals and have different ideas of "what's new".
In addition, the maintainer must explicitly generate the list
of recent changes, usually by manually marking up the HTML.

Automatic comparison of HTML pages and generation of
marked-up pages frees the HTML provider from having to
determine what's new and creating new or modified HTML pages
to point to the differences. There are many ways to compare
documents and many ways to present the results. This section
describes various models for the comparison of HTML
documents, our comparison algorithm, and issues involved in
presenting the results of the comparison.



4.1 What's in a Diff?

HTML separates content (raw text) from markups. While many
markups (such as <P>, <I>, and <HR>) simply change the
formatting and presentation of the raw text, certain markups
such as images (<IMG src=...>) and hypertext references
(<A href=...>) are "content-defining." Whitespace in a
document does not provide any content (except perhaps inside
a <PRE>), and should not impact comparison.

At one extreme, one can view an HTML document as merely a
sequence of words and "content-defining" markups. Markups
that are not "content-defining" as well as whitespace are
ignored for the purposes of comparison. The fact that the
text inside <P>...</P> is logically grouped together as a
paragraph is lost. As a result, if one took the text of a
paragraph comprised of four sentences and turned it into a
list (<UL>) of four sentences (each starting with <LI>), no
difference would be flagged because the content matches
exactly.

At the other extreme, one can view HTML as a hierarchical
document and compare the parse tree or abstract syntax tree
representations of the documents, using subtree equality (or
some weaker measure) as a basis for comparison. In this case,
a subtree representing a paragraph (<P>...</P>) might be
incomparable with a subtree representing a list
(<UL>...</UL>). The example of replacing a paragraph with a
list would be flagged as both a content and format change.

We view an HTML document as a sequence of sentences and
"sentence-breaking" markups (such as <P>, <HR>, <LI>, or
<H1>) where a "sentence" is a sequence of words and certain
(non-sentence-breaking) markups (such as <B> or <A>). A
"sentence" contains at most one English sentence, but may be
a fragment of an English sentence. All markups are
represented and are compared, regardless of whether or not
those markups are "content-defining." In the
paragraph-to-list example, the comparison would show no
change to content, but a change to the formatting.

We apply Hirschberg's solution to the longest common
subsequence (LCS) problem [4, 5] (with several speed
optimizations) to compare HTML documents. This is the
well-known comparison algorithm used by the Unix diff utility
[7]. The LCS problem is to find a (not necessarily
contiguous) common subsequence of two sequences of tokens
that has the longest length (or greatest weight). Tokens not
in the LCS represent changes. In Unix diff, a token is a
textual line and each line has weight equal to 1. In
htmldiff, a token is either a sentence-breaking markup or a
sentence, which consists of a sequence of words and
non-sentence-breaking markups. Note that the definition of
sentence is not recursive; sentences cannot contain
sentences. A simple lexical analysis of an HTML document
creates the token sequence and converts the case of the
markup name and associated (variable,value) pairs to
upper-case; parsing is not required.
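
A lexical analysis of this kind can be approximated with a short
Python sketch. The tokenize function, the regular expressions, and
the exact membership of SENTENCE_BREAKING (beyond the <P>, <HR>,
<LI> and <H1> examples given above) are assumptions of this
sketch. As a simplification, it upper-cases the whole markup and
does not reorder (variable,value) pairs.

```python
import re

# Assumed set; the text names only <P>, <HR>, <LI> and <H1> as examples.
SENTENCE_BREAKING = {"P", "HR", "LI", "UL", "OL", "BR",
                     "H1", "H2", "H3", "H4", "H5", "H6"}

PIECE_RE = re.compile(r"<[^>]*>|[^<\s]+")   # one markup, or one word
NAME_RE = re.compile(r"</?\s*([A-Za-z][A-Za-z0-9]*)")


def markup_name(tag):
    match = NAME_RE.match(tag)
    return match.group(1).upper() if match else ""


def tokenize(html):
    """Split HTML into ('break', markup) and ('sentence', items) tokens."""
    tokens, sentence = [], []

    def flush():
        if sentence:
            tokens.append(("sentence", tuple(sentence)))
            sentence.clear()

    for piece in PIECE_RE.findall(html):
        if piece.startswith("<"):
            tag = " ".join(piece.upper().split())  # normalize case and spacing
            if markup_name(tag) in SENTENCE_BREAKING:
                flush()                            # markup ends the sentence
                tokens.append(("break", tag))
            else:
                sentence.append(tag)               # e.g. <B> stays inside
        else:
            sentence.append(piece)
            if piece[-1] in ".!?":                 # sentence-ending punctuation
                flush()
    flush()
    return tokens
```

Whitespace never appears in the token sequence, so documents that
differ only in spacing or line breaks tokenize identically.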

We now describe how the weighted LCS algorithm compares two
tokens and computes a non-negative weight reflecting the
degree to which they match (a weight of 0 denotes no match).
Sentence-breaking markups can only match sentence-breaking
markups. They must be identical (modulo whitespace, case, and
reordering of (variable,value) pairs) in order to match (see
Section 4.3 for a discussion of the ramifications of this). A
match has weight equal to 1. Sentences can match only
sentences, but sentences need not be identical to match one
another. We use two steps to determine whether or not two
sentences match. The first step uses sentence length as a
comparison metric. Sentence length is defined to be the
number of words and "content-defining" markups such as <IMG>
or <A> in a sentence. Markups such as <B> or <I> are not
counted. If the lengths of two sentences are not
"sufficiently close," then they do not match. Otherwise, the
second step computes the LCS of the two sentences (where
words matching exactly against words are assigned weight 1,
and markups match exactly against markups, as before). Let W
be the number of words and content-defining markups in the
LCS of the two sentences and let L be the sum of the lengths
of the two sentences. If the percentage (2 * W)/L is
sufficiently large, then the sentences match with weight W.
Otherwise, they do not match.
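
The two-step sentence test and the weighted LCS over tokens can be
sketched in Python as follows. The sketch uses a plain quadratic
dynamic program rather than Hirschberg's linear-space algorithm,
and it represents tokens as ("break", markup) or ("sentence",
items) pairs, which is a convention of this example. The text
leaves "sufficiently close" and "sufficiently large" unspecified,
so the two 0.5 thresholds are placeholders, as are all function
names.

```python
CONTENT_DEFINING = {"IMG", "A"}  # markups that count toward sentence length


def counts(item):
    """True for words and for opening content-defining markups."""
    if not item.startswith("<"):
        return True
    inner = item.strip("<>/ ").split()
    name = inner[0].upper() if inner else ""
    return not item.startswith("</") and name in CONTENT_DEFINING


def length(sentence):
    return sum(1 for item in sentence if counts(item))


def lcs(a, b):
    """Unweighted longest common subsequence of two item sequences."""
    n, m = len(a), len(b)
    best = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        for j in range(m - 1, -1, -1):
            best[i][j] = (best[i + 1][j + 1] + 1 if a[i] == b[j]
                          else max(best[i + 1][j], best[i][j + 1]))
    out, i, j = [], 0, 0
    while i < n and j < m:
        if a[i] == b[j]:
            out.append(a[i]); i += 1; j += 1
        elif best[i + 1][j] >= best[i][j + 1]:
            i += 1
        else:
            j += 1
    return out


def sentence_weight(s1, s2, min_length_ratio=0.5, min_overlap=0.5):
    l1, l2 = length(s1), length(s2)
    if l1 == 0 or l2 == 0:
        return 1 if s1 == s2 else 0      # choice made by this sketch
    if min(l1, l2) / max(l1, l2) < min_length_ratio:
        return 0                         # step 1: lengths too far apart
    w = sum(1 for item in lcs(s1, s2) if counts(item))
    return w if 2 * w / (l1 + l2) >= min_overlap else 0   # step 2


def token_weight(t1, t2):
    (kind1, value1), (kind2, value2) = t1, t2
    if kind1 != kind2:
        return 0
    if kind1 == "break":
        return 1 if value1 == value2 else 0
    return sentence_weight(value1, value2)


def weighted_lcs(old, new):
    """Return matched (old index, new index) pairs of maximal total weight."""
    n, m = len(old), len(new)
    weight = [[token_weight(o, t) for t in new] for o in old]
    best = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        for j in range(m - 1, -1, -1):
            best[i][j] = max(best[i + 1][j], best[i][j + 1])
            if weight[i][j]:
                best[i][j] = max(best[i][j], weight[i][j] + best[i + 1][j + 1])
    pairs, i, j = [], 0, 0
    while i < n and j < m:
        if weight[i][j] and best[i][j] == weight[i][j] + best[i + 1][j + 1]:
            pairs.append((i, j)); i += 1; j += 1
        elif best[i][j] == best[i + 1][j]:
            i += 1
        else:
            j += 1
    return pairs
```

Tokens whose index does not appear in any returned pair are the
ones that a comparison of this kind would report as changes.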



4.2 Presentation of the differences

The comparison algorithm outlined above yields a mapping
from the tokens of the old document to the tokens of the new
document. Tokens that have a mapping are termed "common";
tokens that are in the old (new) document but have no
counterpart in the new (old) are "old" ("new"). We refer to
the "old" and "new" tokens as "differences".

We investigated three basic ways to present the differences
by creating HTML documents that highlight the differences
with a variety of markup techniques:

Side-by-Side: A side-by-side presentation of the documents
with common text vertically synchronized is a very
popular and pleasing way to display the differences
between documents (see, for example, UNIX sdiff or SGI's
graphical diff tool gdiff). Unfortunately, there is no
good mechanism in place with current HTML and browser
technology that allows such synchronization (although it
might be possible to make a document that contained a
table with a document per column in which rows of the
table were used to achieve synchronization).

Only Differences: Show only differences (old and new) and
eliminate the common part (as done in Unix diff). This
optimizes for the "common" case, where there is much in
common between the documents. This is especially useful
for very large documents but can be confusing because of
the loss of surrounding common context. Another problem
with this approach is that an HTML document comprised of
an interleaving of old and new fragments might be
syntactically incorrect.

Merged-page: Create an HTML page that summarizes all of the
common, new, and old material. This has the advantage
that the common material is displayed just once (unlike
the side-by-side presentation). However, incorporating
two pages into one again raises the danger of creating
syntactically or semantically incorrect HTML (consider
converting a list of items into a table, for example).

Our preference is to present the differences in the
merged-page format to provide context and use internal
hypertext references to link the differences together in a
chain so the user can quickly jump from difference to
difference. We currently deal with the syntactic/semantic
problem of merging by eliminating all old markups from the
merged page (note that this doesn't mean all markups in the
older document, just the ones classified as "old" by the
comparison algorithm). As a result, old hypertext references
and images do not appear in the merged page (of course, since
they were deleted they may not be accessible anyway).
However, by reversing the sense of "old" and "new" one can
create a merged page with the old markups intact and the new
deleted. A more Draconian option would be to leave out all
old material. In this case, there are no syntactic problems
given that the most recent page is syntactically correct to
begin with: the merged page is simply the most recent page
plus some markups to point to the new material. We are
exploring other ways to create a merged page.

An example of htmldiff's merged-page output appears in
Figure 1. Markups are used to highlight old and new material
as follows. Two small arrow images are used to point to areas
in the document that have changed. A red arrow points to old
content and a green arrow points to new content. The arrows
are also internal hypertext references to one another, linked
in a chain to allow quick traversal of the differences. A
banner at the front of the document contains a link to the
first difference. Old text is displayed in "struck-out" font
using <STRIKE>, which in our experience is rarely used in
HTML found on the W3. Unfortunately, there is no ideal font
for showing "new" text. We currently use <STRONG><I>.
Ideally, we would like to be able to color code the text or
text background to highlight old and new text, but this
capability is not provided by current browsers. Another
approach would be to choose a font that is not active at the
point of the difference.

Note that not all changes in the documents are
highlighted. For example, new markups that are not
"content-defining" (such as <P>) are not marked up. However,
markups such as anchors are highlighted. Consider the
example of changing the URL in an anchor but not the content
surrounded by <A>...</A>. In this case, an arrow will
point to the text of the anchor, but the text itself will be in
its original font, signifying a change to just the URL.
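
The merged-page markup described above can be illustrated with a
minimal sketch. It is not htmldiff itself: it assumes plain-text
input split into words, uses Python's difflib as a stand-in for the
comparison algorithm, and replaces the arrow images with numbered
text anchors. The function name merged_page and the anchor names
diff1, diff2, ... are hypothetical.

```python
import difflib
from html import escape


def merged_page(old_text: str, new_text: str) -> str:
    """Merge two plain-text versions into one HTML fragment.

    Old words are wrapped in <STRIKE>, new words in <STRONG><I>, and each
    difference starts with an anchor that links to the next difference.
    """
    old, new = old_text.split(), new_text.split()
    matcher = difflib.SequenceMatcher(a=old, b=new, autojunk=False)
    opcodes = matcher.get_opcodes()
    total = sum(1 for op in opcodes if op[0] != "equal")

    pieces = []
    n = 0
    for tag, i1, i2, j1, j2 in opcodes:
        if tag == "equal":
            pieces.append(escape(" ".join(old[i1:i2])))
            continue
        n += 1
        nxt = n % total + 1  # the last difference links back to the first
        pieces.append(f'<A NAME="diff{n}" HREF="#diff{nxt}">[{n}]</A>')
        if i2 > i1:  # material present only in the old version
            pieces.append("<STRIKE>" + escape(" ".join(old[i1:i2])) + "</STRIKE>")
        if j2 > j1:  # material present only in the new version
            pieces.append("<STRONG><I>" + escape(" ".join(new[j1:j2])) + "</I></STRONG>")

    banner = '<A HREF="#diff1">First difference</A>' if total else "No differences"
    return banner + "\n" + " ".join(pieces)
```

Because common words are emitted once and every difference carries
its own anchor, a reader can follow the chain from the banner through
each change in turn.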

4.3 Issues and Extensions 

Since htmldiff can parse an HTML document and rectify
certain syntactic problems, such as mismatched or missing
markups, the only real problem it is likely to encounter is a
set of changes that are so pervasive as to make the resulting
merged HTML unreadable. For instance, if every other line
were changed, then the mixture of unrelated struck-out and
emphasized text would be muddled. We are experimenting
with methods for varying the degree to which old and
new text can be interspersed, as well as thresholds to
specify when the changes are too numerous to display
meaningfully.

Currently, htmldiff is neither "version-aware" nor
"web-aware". That is, htmldiff only compares the text of two
HTML pages. It does not compare versions of the entities
that the pages refer to, access them, or invoke itself
recursively on other referenced pages. This has a number of
consequences. The good news is that htmldiff does not
incur the overhead of pulling versions from a repository or
sending requests over the W3 for information. This cost is
borne by w3newer and snapshot. The bad news is that
some differences may be ignored. For example, if the
contents of an image file are changed but the URL of the file
is not, then the URL in the page will not be flagged as
changed. To support such comparison would require some
sort of versioning of referenced entities and would also
require htmldiff to have access to the version repositories.
Full versioning of all entities would allow interesting
comparisons to be done, but would dramatically increase
storage requirements. A cheaper alternative would be to store
a checksum of each entity and use the checksums to
determine if something has changed. We are exploring how to
efficiently perform such "smarter" comparisons.
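
The checksum alternative can be sketched as follows. This is an
illustration under assumptions, not part of the described tools: the
text does not name a checksum algorithm, so SHA-256 is used here, and
the function names and the URL-to-checksum table are hypothetical.

```python
import hashlib
from typing import Dict, List


def checksum(data: bytes) -> str:
    """Return a fixed-size digest of an entity's contents (SHA-256 assumed)."""
    return hashlib.sha256(data).hexdigest()


def changed_entities(stored: Dict[str, str], current: Dict[str, bytes]) -> List[str]:
    """List URLs whose current contents no longer match the stored checksum.

    `stored` maps URL -> checksum recorded when the page was archived;
    `current` maps URL -> freshly retrieved contents. An entity with no
    stored checksum is reported as changed.
    """
    return sorted(
        url for url, body in current.items() if stored.get(url) != checksum(body)
    )
```

Only one digest per entity is kept, so an image whose contents change
under the same URL can be detected without archiving every version of
the image.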

5 Integrating the tools 

There are two entry points to NO HANDS, one through
w3newer and one through snapshot.

Currently, w3newer is invoked directly by the user, probably
by a crontab entry, and generates an HTML document
indicating which pages have changed. If specified,
w3newer will associate three links with each document in
the hotlist:

Remember Send the URL to the snapshot facility, to save
a copy of the page. Though the page is retrieved, the
RCS ci command ensures that it is not saved if it is
unchanged from the previous time it was stored away.

Diff Have the snapshot facility invoke htmldiff to display
the changes in a page since it was last saved away by
the user.

History Have snapshot display a full log of versions of this
page, with the ability to run htmldiff on any pair of
versions or to view a particular version directly. (See
Figure 2.)

Thus, each page that is reported as "new" can
immediately be passed to htmldiff, and any page in the list can be
"remembered" for future use. An example of w3newer's
output appears in Figure 3.

A user may also choose to enter snapshot directly to
check in pages, or view the current page or the version
history. Figure 4 shows the interface to NO HANDS through
snapshot. If the user selects the history link, the page shown
in Figure 2 is presented. Finally, selecting two pages to
compare invokes htmldiff, as in Figure 1.

One disadvantage of the current approach is that there is
no direct interaction between w3newer, snapshot, and the
browser. Viewing a page with htmldiff does not cause
the browser to record that the page has just been seen;
instead, the browser records the URL that was used to invoke
htmldiff in the first place. Subsequently, w3newer uses the
obsolete datestamp from the browser and continues to
report that the page has been modified more recently than the
browser has seen it. As a result, the user must view a page
directly as well as via htmldiff in order to both remove it
from the list of modified pages and see the actual
differences.
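
The stale-datestamp problem can be made concrete with a small sketch.
It assumes the browser history is a table from URL to the time of the
last visit; the function name, the example URLs, and the integer
timestamps are all hypothetical.

```python
from typing import Dict

PAGE = "http://example.org/page.html"
# Hypothetical URL under which the browser records a visit made via htmldiff.
DIFF_VIEW = "http://example.org/cgi-bin/htmldiff?url=" + PAGE


def reported_as_modified(url: str, last_modified: int, history: Dict[str, int]) -> bool:
    """A page is reported when it was modified after the recorded visit to it."""
    seen = history.get(url)
    return seen is None or last_modified > seen


history = {PAGE: 100}      # the page itself was last visited at time 100
history[DIFF_VIEW] = 250   # viewing the differences records a different URL

# The page changed at time 200. The user saw the change at time 250, but only
# through htmldiff, so the page's own history entry is still 100.
still_listed = reported_as_modified(PAGE, 200, history)

history[PAGE] = 260        # a direct visit finally updates the right entry
cleared = not reported_as_modified(PAGE, 200, history)
```

In this sketch still_listed and cleared are both true, which mirrors
the need to view the page directly as well as via htmldiff.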



6 Extensions 

This section describes some possible extensions to the work
already presented. Section 6.1 discusses an interface
between RCS and htmldiff that is already implemented, while
Sections 6.2 and 6.3 present unimplemented extensions to
integrate tracking modifications into the server and to
invoke scripts via the HTTP POST protocol.

6.1 Server-side Version Control 

The tools described above do not require any changes to
arbitrary servers or clients on the W3. Existing GET and POST
protocols are used to communicate with specific servers that
save versions of documents and provide marked-up
versions showing how they have changed. However, if a server
runs htmldiff and some perl scripts, it can provide a direct
version-control interface and avoid the need to store copies
of its HTML documents elsewhere.

The perl scripts we have written provide an interface to
RCS [14]. A CGI script (/cgi-bin/rlog) converts the
output of rlog into HTML, showing the user a history of the
document with links to view any specific version or to see
the differences between two versions. Another script
(/cgi-bin/co) displays a version of a document under RCS
control, while still another (/cgi-bin/rcsdiff) displays the
differences. If the file's name ends in .html then htmldiff is
used to display the differences, rather than the rcsdiff
program.

As an example, one might set up a Last-Modified field
at the bottom of an HTML document to be a link to the rlog
script, with the document name specified as a parameter.
After clicking on this unobtrusive field, the user would be
able to see the history of the document.

6.2 Server-side URL Tracking

Currently, w3newer runs on the user's machine, so
multiple instantiations of the script may perform the same work.
Although it runs a related daemon on the same machine as
an AT&T-wide proxy-caching server, which returns
information about pages that are currently cached on the server
and may eliminate some accesses over the Internet, there is
insufficient locality in that cache for it to eliminate a
significant fraction of requests.

Alternatively, w3newer could be run on the set of pages
that have been saved by the snapshot daemon. Regardless
of how many users have registered an interest in a page,
it need only be checked once: if changed, the new version
could be saved automatically. Then a user could request a
list of all pages that have been saved away, and get an
indication of which pages have changed since they were saved
by the user.
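
A minimal sketch of this unimplemented extension follows. It assumes
an in-memory archive mapping each URL to its list of saved versions
and a per-user table recording which version the user last saved; the
names poll_archive and changed_for_user, and the fetch callable, are
hypothetical. The real system would keep versions under RCS rather
than in a list.

```python
from typing import Callable, Dict, List


def poll_archive(archive: Dict[str, List[bytes]], fetch: Callable[[str], bytes]) -> None:
    """Check every archived page once, however many users track it.

    If the fetched contents differ from the newest saved version, the new
    contents are appended as a further version.
    """
    for url, versions in archive.items():
        body = fetch(url)
        if not versions or versions[-1] != body:
            versions.append(body)


def changed_for_user(archive: Dict[str, List[bytes]], user_saved: Dict[str, int]) -> List[str]:
    """Return the pages with a version newer than the one this user saved.

    `user_saved` maps URL -> index of the version the user last saved.
    """
    return sorted(
        url for url, index in user_saved.items() if len(archive[url]) - 1 > index
    )
```

The polling cost depends on the number of distinct pages, not on the
number of users, which is the economy of scale the text refers to.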
Adding this functionality would be useful, since it would
offer economies of scale. It would have the disadvantage
of being decoupled from a given user's browser
history; i.e., if a user views a page directly, the snapshot
facility would have no indication of this and might present the
page as having been modified.

6.3 Interactions with CGI scripts

Because NO HANDS can handle arbitrary URLs, it can
interact with CGI scripts that use the GET protocol by passing
arguments to the script as part of the URL. However, services
that use POST cannot be accessed, because the input to the
services is not stored.

Both w3newer and snapshot would have to be modified
to support the POST protocol, in order to invoke a service
and see if the result has changed, and then to store away the
result and display the changes if it has. The interface to NO
HANDS to support POST is unclear, however. A user could
manually save the source to an HTML form and change
the URL the form invokes to be something provided by NO
HANDS. It, in turn, would have to make a copy of its input
to pass along to the actual service. The result would be an
HTTP equivalent of a UNIX pipe, interposing an extra
service between the browser and the service the user is trying
to invoke.

Instead, the browser could be modified to have better
support for forms:

• It should store the filled-out version of a form in its
bookmark file, so the user could jump directly to the
output of a CGI script.

• It should be able to pass a form directly to NO HANDS,
along with the URL specified in the FORM tag, so that
the output could be stored under RCS.

7 Conclusions 

NO HANDS combines notification, archiving, and
differencing of pages into a single cohesive tool. It achieves
economies of scale by avoiding unnecessary HTTP
accesses, saving pages at most once each time they are
modified (regardless of the number of users who track them), and
using RCS as the underlying versioning system. Automatic
generation of differences within the HTML framework
provides users with the ability to see both insertions and
deletions in a convenient fashion.

In the general setting of the W3 and document retrieval,
NO HANDS benefits two communities: users of the W3 no
longer have to browse to find pages of interest that have
changed; HTML providers no longer have to create
suitably marked-up pages to show "what's new". While such
automation is clearly helpful in this general context, we
expect that NO HANDS will be a critical part of more focused
uses of the W3, especially in areas involving collaborative
and distributed work.

Several issues still need to be addressed. In particular,
many of the complications of NO HANDS could be avoided
by better integration with browsers and servers. For
instance, viewing the difference between an older version of a
page and its current version should update the browser's
notion of when the page was last visited. Finally, the
increasing availability of distributed, hierarchical HTTP
repositories such as Harvest [1] will be both an opportunity and a
challenge for scalable notification mechanisms and version
archives.
datatype lexresult =
    OPEN |
    ENDOPEN |
    DECLOPEN |
    PROCOPEN |
    CLOSE |
    OPENCOMMENT |
    CLOSECOMMENT |
    EQ |
    CONTENT of string |
    COMMENT of string |
    STR of string |
    ID of string |
    EOF

val eof = fn () => EOF
val filename = ref ""
(*
fun dprint(s) = (print (s ^ " (" ^ (makestring (0)) ^ ")\n"))
*)

fun dprint(s) = ()

val error = fn (x,i:int) => output (std_err, x ^ " line " ^
                                    (makestring (i)) ^
                                    " file " ^ (!filename) ^ "\n")

%%

%s MU COMM COMMSTUP;

%structure HtmlLex

ws = [\t\ \n]+;

nonws = [^\t\ \n<>=!]+;

%count

%%

<INITIAL>[^<>]+ => (dprint "CONTENT"; CONTENT(yytext));
<INITIAL>"<" => (YYBEGIN MU; dprint "OPEN"; OPEN);
<INITIAL>"</" => (YYBEGIN MU; dprint "ENDOPEN"; ENDOPEN);
<INITIAL>"<!" => (YYBEGIN MU; dprint "DECLOPEN"; DECLOPEN);
<INITIAL>"<?" => (YYBEGIN MU; dprint "PROCOPEN"; PROCOPEN);
<MU>">" => (YYBEGIN INITIAL; dprint "CLOSE"; CLOSE);
<MU>"--" => (YYBEGIN COMM; dprint "OPENCOMMENT"; OPENCOMMENT);
<MU>"=" => (dprint "EQ"; EQ);

<MU>\"[^\n"]*\" => (dprint "STR"; STR(substring(yytext, 1, String.size(yytext)-2)));
<MU>{nonws} => (dprint "ID"; ID(yytext));
<MU>{ws} => (lex());

<COMM>"--" => (YYBEGIN MU; dprint "CLOSECOMMENT"; CLOSECOMMENT);
<COMM>"<" => (YYBEGIN INITIAL; dprint "CLOSECOMMENT";
              (yybufpos := (!yybufpos)-1); CLOSECOMMENT);
<COMM>[^-<>]+ => (dprint "COMMENTTEXT"; COMMENT(yytext));

. => (error("html lex: ignoring bad character "^yytext, !yylineno); lex());

%%
#!/usr/local/bin/perl # A comment mentioning perl, to prevent looping

eval 'exec perl -S $0 ${1+"$@"}'

    if 0;

t For tf Uctsc inio on rrtd Dcti9l:s's updact ci vi.O of vinowtr. m 
t 

» 

• For CAt Uust ir.£:7aiuon cs arooiu r^tccr s v«rsion ef wImv. ebcck tha URL 

• ftt:?: /.'tiMf .cnc.pAr«dyM.com/c;i*bin/bbeuro?ustr*3cuccmpk9«w3D«w 

■ vjlMWtr vl.: • CzMIM • Htw lui Of U1IL*« 
• 

■ iiodifiio lay frcd 2ou9lis <dcuciis#rMtarcn.ac£.:s3i> 

• sapamt cstr ir.carCaci tr=a poIIitjc 
• 

# w3newer is a program that will extract a list of URL's from either your Mosaic

# hotlist, or will extract the URL's from a HTML document. It will then obtain

# the HTTP/1.0 modification dates for each document listed, and output HTML

# with the URL's sorted by their last modification time.



• Pacugt: vjntwai 

■ AsUor; Srscju Ciittr :tcu::crf;arady::«.:oai 

t iaiiici: 7 red Sou; lis (dcugi:i«rataar=r..acc.ssu 

• Utasc vtriic.1: l.C 

> Lasc uptiacee. Sar:::. 1995 

• A;c&iva: vjnawtr.ur.9i 

I ilacludas r/trycair? you iiMd u nx it. txcip^ ;arl v4.03SI 

f 

• 

# What it does:
#
# 1. w3newer first extracts the URL's from your hotlist or a HTML document, and
#    cached modification dates and access times from its own cached
#    history and the WWW browser's history file.
#
# 2. w3newer then hands off the URLs to a separate process, which
#    proceeds to do a HTTP/1.0 HEAD on each http: URL that might
#    be out of date and stores the Last-modified time of each URL (if
#    available).
#
# 3. Finally, w3newer sorts its output based on the last modified time of the
#    document and categorizes them by the month they were modified. For
#    non-http URL's or URL's for which it can't retrieve the Last-modified
#    time, it will list these at the bottom of its HTML output.
#
# This program was written because Brooks found himself frequently checking web
# pages in his hotlist to see if they had recently been changed. Now
# he runs w3newer from cron nightly, and checks its output each morning.
#

# Pointing w3newer at a list of URL's
#
# 1. Using your Mosaic hotlist: by default, w3newer will read your Mosaic
#    hotlist file and extract the document URL's and document titles. It will
#    look for your hotlist in your home directory as the file
#      ~/.mosaic-hotlist-default
#
#    This can be overridden by using the environment variable
#      W3NEWER_HOTLIST
#
#    The default and environment variable can be overridden with the
#    command line arguments -hotlist_fn or -i with an argument of the file in
#    mosaic-hotlist format.
#
# 2. Extracting URL's from a HTML document: if you call w3newer with the
#    -u argument and a HTML document URL, it will retrieve that document,
#    extract all the links in the document, and check the modification time of
#    each link.
#
# 3. Extracting URL's from a HTML document and their modification times:
#    if you call w3newer with the -html argument and a HTML document
#    URL, it will extract the document links and retrieve the modification
#    times, but unlike #2 (-u argument), it will replace the modification
#    times within the document rather than producing a sorted list.
#
#    When w3newer parses the document, it will look for a tag of the form
#
#      <w3newer url="URL">
#    for example
#      <w3newer url="http://www.host.dom/file.html">
#
#    When it finds a tag like the one above, it will retrieve the last
#    modification time for the

J for «r. uU,^tio„ e» t.. ^ 

• Known Bugs 



• o Don* Jthat ; 
> Futuia vorJt 



: ' • - 

I ^"^ <»«Uttirtp«t*Jyn..co-J 

prist <ceor; 
•••SIWW" ttubit to stm Wi,#»,r 

Sorty, I eon-t taov i«„, ^3„^, j^,^, ^^^^^^ 

'lout tithtr t«t tht v&riabia * At 

•xit; •• 
--.;•« IIDW( i,ctpj»rojiy) J { 



rsa.ho.att.i 



v;rar.i!tc»:5C. 

•i-r5Ro-*tr.i:r*. 
# Where's libwww-perl?

# either in the directory specified by LIBWWW_PERL,

# or under lib in w3newer_dir

•/boM/douQlis/lib/ptri*. 

); 

»wh«cs_iicv • *pftrl $E}iV('LlBiAM_mu.')/FOST ht:p:.'/hocvtb.eAi.aLt.:aoi/-dott9lis/C9i-biA/vtucBMw.e9i' 

• arch&ctccuxfiptcific libraty (tor sys/ioekct.ph; 
r«|uin 'Arch. pi*: 
iuch • urcb: 

if l-d (Slibioc • 'Sw^Mwvr.diz/lib/ arch/Such* H ( 
unthi£tl0Z!JC. SliblocK- 

J 



• TBAU.: Ut's k«tp SOM uMge sues 
aptamUL.M aiiiq tball*!. 
Sdstt • 'dtu'r 

;rint iu:l 'Subicci: w)ntwtrin5SlvrDSEli*l\nSdact>r.* 
:ios»iwaLi : 



»rtquirt Irtqutst.pl- : • pare ot mm now 

rtquirt scat. pi': 

rtquirt 'www.pl*: 

rtquirt 'vwwuri.pi-; 

nqniro 'wwworror.pl*: 

roquro 'wwMdatts.pl- : 

roqoirt ^wuwboc.pl': i pare o2 libwww-porl v0.20 r 1 Utor.. 

unshtettllK.Vappl/ovapiUbM if (-d vappl/ovap't: 
roqpiro 'tvap.pl': 

rtquirt 'parst.htsa.pl' : 

f You sbouldn'c nttd to ccan^* taychio? btlow btrt.. 
S9l.rtqiatse.tiMout • 4S: 

•mchs • ( 

' January •.' Ftbnury •.' warcb April 'Kay June '. 'July * , 
'August*. Scpctnotr . 'Occobtr*. Hovvter'. 'Dtcwter' 

•daytai • Sunday*. •Monday*. "Tutsday . 'Htteotday . *llmrsday' . 'Friday* . Saturday > : 

Swlntwtr.tuthor.boot « 'stuff.ew: 
Swiotwtr.vtr « 'l.O*; 

SwJnmr.author.aaail < 'bcucttrlSvJotwtr.attCbor.bcn* ; 
Sw3aMtr.aucher • «<eOF: 

«a braC>*bttp : / / www. SwJntwtr.attthor_hoM/cgi*bin/bbcuxn>uatr*bcucttr*> 

brooks Cutttr</a> 

cor 

SvJntwtr.oldurl » 

*bctp: I /www. stuff .eM^'bcutcir/taont/prograM/wSntw/wSatw.htad' : 
Sw3Atwtr.url • 
'beep : / / WWW1126 . rtstarcb.act .eoa/-dou«lis/traek_urU/ * : 

Sw3ntwtr .authors • «COP: 

<A HUra*hctp://www.rtaaarcb.a:t.caa/ot9S/asr/ptoplt/d0O9iii/*>rrtd Oattglia</A> 
BOP 



I if OltL isn't fully qualifitd. this is ustd with a«Mmrl*abioUttl) 
Sbast • 'iilt://localbost-. iSENVt'PM)'} |t SDIVCcwd*)! 

SUstrAgtnc » *w3ntwtr/Sw3ntwtr.vtr Swww'Library* : f Sac up Dstr-Agtr.t: btadtr 
• Stt £ha dtfaule Vstr-Agtot 
4tMwsit.dtl.btadcri 'ht:p' . 'Ustr-Afftnt' .SUstrAgaati : 

9hocUst.nttseapt • |*$&Vt'HOKE')/.ntcscapfboooarks.bcal*. 

•SE»V( 'HOKE' ) / .lCOM«booJmarks .ntal ' ) : 
lbitcory.nttJcapt ■ i*$E}iv( 'home* )/.nttscapt-&istory*. 

• SDW 1 • HOW • I / . MCCM-global-hi story • i ; 
•heeUst.BDsaie « *SDfVi'HOiC'l/.BOsaie-hocUst-difaulc*: 
•hmcry.nosaic « 'SDiVt' HOKE 'W.aosaic-glbbal -history': 

$dtfault.browstr • 'nttscapt': 

t This rtquircs tnt tvap.pl lifirary 

5rr viRrwtt 
# the following options control output processing
vtrbost. V: switch 

cutpuc.fn. 0: filt • WSOTMatjmc. '-/publlejical/wjatvtr.satput.bwl* 

• If you prtttr co mv« it output to snxxrr. uncotntat zr,% ntxt lint.. 

• eucput.fn. 0: fill • IQHDIDI.HXML. SC4out 
t tiM tollQwuiff eoauol wlMrt co get OtU 

■osue. at: switch 
n«tiCAp«. ns: MltCb 

hotiiit.in, i; iiu . wjurww.iwrusT. 

hi»tory^ln. h: liU • WWWII.MISTOIIT. 

\itl. u: itria« 

bt&l.url. hc£l: icring 
t tboM csntrol procwsiog 

trQB_iddnsB. ttsai scring • DUIL.ADOUSS. ** 

ciMCRSua. es; idoolMn • TRDt 
t thtio control tht fotnit of tht ootput 

dxtplty.Mtdtr. Ih: booUtn • TlUE 

^*pi«y.5cotir. Ai: booittn • TKJZ 

lisplr/.tmioiown. booltAn • 1KJE 

dispUy^rtctflt. t:-. boolttn • TlUt 

dispUyj^^nangttf. booltin > ttstt 

«Bboidtn. switcb 

Utc.visicad. ;■/: cvitch 

txMtciBV.Muret. is: tvitch 

• thMt ecatrsi cacnod latotMtien 
caeiM.ae<!tixts. «: booioM ■ WJt 

ca^fiit. caf: lii* • tfJNOm.otrzu. .wJntMtt.aodtwts- 
csLdtf.tlirtafcald. cmt: itruig • M]inMEI^e3'^F.7KM5X. 'li' 
««nt_lttttt. wl: bOOltM - fALSt 

c^cbsol«ti_thrtJhold. cttot: string ■ VimajOtJitiJXtSSH. '3d* 

cictM.thrtsholdt.'n. ctn: filt • MKT ^.CTTIU .vlRMtr.tiizttaolds* 
> mKtlUatoua optioai, t«p. tor dtteigfiig 

4olMi«. latogtr • 0 

Mx.uslt. Mx.wli: inttgor - -1 

niltijistr. 8u: stfitcb 
t nop. a: switcti • fast 

• iattgrtite into aoium* 

• wJntwtr.cs.filt. c$f: filt * MinoanjCsriXX. .^lnwT_zb»CKimt' 
supsboi. s: mcch 

Maptiwt.uil. su: string • WJKDnx.SHAP.UXL. 'http:/.'rtdisa.rtstatca.«tt.coa/cgi-bxn/r.o.hM4s* 

xgnott.nobots. ign; sviteb 

foiBS. fcras: svitcb 
rotno optional^Cilt.list 
EOT 

»«vip^t « spUtiAn/.Stvap^t): 

StVSp.Bn •«tor: 
w3newer v$w3newer_ver

w3newer is a program that will extract a list of URL's from a hotlist
(either Netscape or Mosaic format), or will extract the URL's from an
HTML document. It checks the modification dates of URLs that have not
been checked recently and produces a report of URLs that have been modified
since they were viewed.

It tries not to poll things too often. It uses the following techniques:

- If your browser's history shows something has been accessed recently, it
isn't checked.

- If w3newer checked something recently, it isn't checked.

- If w3newer knows something has changed since it was last visited,
it marks it as new but doesn't check for a more current date
unless it deems the modification date information to be
"obsolete" (see documentation for cm_obsolete_threshold).

- The definition of "recent" can vary from URL to URL, based on
specifications in a "thresholds" file.

Currently, w3newer only handles http URLs, not ftp, file, or others.
This may change.

Examples:

wintwtr 

wJntwtr •usagt.btlp 
Wntwtr -V 
wJntwtr -full.htip 

rfjntw*r -M -0 -iv <ottacapt. bold, last vistttd. vtrbosti 

.verbose

When verbose is set, w3newer will print out the name of each URL
it checks and generate some other status information.
.checksum

When checksum is set, w3newer will use checksums to tell if something
has been modified if the modification date is unavailable.
This specifies where the HTML output is placed (stdout indicates
standard output).
.display_header

determines whether or not to print a header on the HTML output
.display_footer

determines whether or not to print a footer on the HTML output
.display_unknown

determines whether or not to display items for which it cannot
find HTTP/1.0 Last-modified dates
.display_recent

determines whether or not to display URLs that were recently accessed
and not checked during this run.
.display_unchanged

determines whether or not to display URLs that have not changed
since they were last seen.
.hotlist_fn

The filename of your Netscape/Mosaic hotlist. By default it is set to
the file ~/.netscape-bookmarks.html or ~/.mosaic-hotlist-default,
depending on whether the 'netscape' or 'mosaic' option is set. The
default can be overridden with the environment variable W3NEWER_HOTLIST.
.history_fn

The filename of your Netscape/Mosaic history. By default it is set to
the file ~/.netscape-global-history or ~/.mosaic-global-history,
depending on whether the 'netscape' or 'mosaic' option is set. The
default can be overridden with the environment variable W3NEWER_HISTORY.
.url

w3newer will retrieve the HTML file specified by this argument,
extract all hyperlinks and images from the document, and
proceed to check the last modification time for each URL.
.html_url

If you call w3newer with the -html argument and a HTML document
URL, it will extract the document links and retrieve the
modification times, but unlike the previous option (-u argument), it will
replace the modification times within the document rather than
producing a sorted list.

When w3newer parses the document, it will look for a tag of the form

<w3newer url="URL">
for example

<w3newer url="http://www.host.dom/file.html">

When it finds a tag like the one above, it will retrieve the
last modification time for the document specified by the url=
value, and include the last modification of that URL (if it
exists).
.from_address

By default, the program will set the HTTP/1.0 From: header to
your username (specified by the USER environment variable) and your
hostname. If your hostname isn't fully qualified (i.e., doesn't
have a "." in it) then it will run 'domainname' and tack on the
result (if any).

This value is also used for the snapshot arguments (see below).

This value can be overridden by setting the environment variable
EMAIL_ADDRESS or passing your mail address with the -from flag.
.multi_user

This flag is not yet implemented. It refers to sharing modification
dates among multiple users.
.netscape

This flag indicates that Netscape bookmark and history files should
be used.
.mosaic

This flag indicates that Mosaic hotlist and history files should
be used.
.embolden

This flag indicates that certain items that are deemed to be of
special interest should be generated in boldface. Currently this
refers to the dates of items that have been modified since they were
last seen.

.last_visited

This flag indicates that the time when a URL was last visited,
according to the history file, should be included in w3newer output.
.timestamp_source
This flag indicates that the source of information for a URL
(proxy-caching server, polling, previous run, etc.) should be
included in w3newer output.

This specifies the file that caches modification dates of

URLs. The default can be overridden by the environment variable
This option indicates that modification dates should be cached. If

FALSE, cm_file is ignored.
.cm_def_threshold

This option specifies the default frequency for checking URLs, in

the form Nm, Nh, Nd, Nw for N minutes, hours, days, or weeks, respectively.
.want_latest

If this switch is set, always get the current modification date of a URL
that has not been accessed recently, rather than skipping it if
its known modification date is more recent than the time it was
visited by the browser and not beyond the date specified by the
-cmot switch.
.cm_obsolete_threshold

This switch specifies when a cached modification date should not be
considered to be current even if it indicates that a URL has not
been seen since it was modified.
.cache_thresholds_fn

This option specifies the file that specifies how often different

URLs should be checked. If it does not exist, then all URLs are
checked at a frequency determined by the -cmdt option. The format
of this file is:

URL <whitespace> time_specification

Default time_specification

Obsolete time_specification
Default overrides the value of -cmdt. Obsolete overrides -cmot. If
time_specification is "never" the URL is skipped even if it is found
in the hotlist. The URL can be a pattern to match against (perl regexp
syntax).
.debug

When debug is > 0, additional debugging output is generated. The higher
the number, the more output there is.
.max_urls

This option specifies a limit on the number of URLs to process,
primarily for debugging purposes.
.snapshot

If set, the output will contain links to the snapshot CGI daemon.
.snapshot_url

URL to use for snapshot.
.ignore_nobots

Set in order to bypass saved information about which sites disallow robots (all or none).
.forms

Set to produce a form for each URL rather than direct hyperlinks.
EOF

•avsp.ai • spun / \n/ . »avap^) ; 

dit 'Error tatonad froa avip\n* if U avapC'avtp^t. 'avap^t 1): 

• t! eroei_addraas waa aat. sat tha dafault haadar.. 
ii iSoptioiu(*fraa_addrasa'|} I 

•WWW' aat.daf .haadar t *http' , 'Pro' . SoptioasC * fro«.addraas' ) ) ; f * 
) alsa I 

locall$asar> • SOIVCUSIX') i| gatlogln tt (gatpwuid(S<))tO| || 'Xntrudar!*': 

• loeU($beat> - lOlvraOST') 11 choptSfoo • '/uar/ocfo/hoataaDa'; || 'UbJaom aoat*: 
lecallShoat) • ShoacaaM'rqaii 

^ Sopt&eoa(*troai.addr«sa') • jolarl*. Suaar. Shoat&Ma'fQERl; 

Soptien^string • **; 
for J'Diff. 'MMater*. -History M t 
Septson.suuig •<optien> *: 

) 

:! (iSoptionsl'output.fD'J aq '>-'! || ISoptiooal'outpat.fn') — r\$*\|/l) ( 
opantoirr. $optiona( *outpat.ttt* ) ) 

il dit 'Can't writs to $options( 'oucpuc.fn'l : S!\n'; 
i aist ( 

;octi{Solteask» • imak: 
it i-a $optioiis('output.fn')l ( 
loealiaaryi > statl.); 
local itaodoi > SarylSSr.NOOCI ; 
uaaaklSoltesk | (-Saoda k 07711); 
if i-s Soptionsroutpttt.en')l ( 

ranasttiSoptionaCOtttput.fn'l. 'Soptionsf *output.fn'}-*i |) 

print STOSXM 'Haraiagt can't tanana Soptiensi 'output fa')Vn*: 

I 

c^a^icur. *>$options( 'outpvt.fn' 1*1 

:i dia *can't vrita to Soptionst'output.fn'l : $!\n*: 
^sk S»ldnak: 

£:;-.;.:raCvtr9cta-| • 1 it toptional 'dabug- 1: I dabug isplits ^rbesa. 
if (!($options{'mosaic'} || $options{'netscape'} ||

      $options{'hotlist_fn'} || $options{'url'} || $options{'html_url'})) {
    $options{$default_browser} = 'TRUE';

}

sub acsucburl : 
loc»l iSurli • 

it iSoptiOttS(-t«9u9'i * Ot ( 

prin:! STZUJl *Cet a URL iSurll wt didn't nk tct. Wt kr^ aiwut:*: 
lee«i iSu:: 

tar Su tisr: <(iys turlsi ( 

pr&stf inODUl *\t%svD*. Sul; 

I 

I 



teia«ctv itil^ «n«Mi> ( 
rtt'jrr. iliit: 



if tSc?::o.is« -aoii::- )i { 

Sop;xons('hist:r/.!r>' I ■ fcfir.d.first.vxistuivlVtviStory.BOMic; ; 
} cli:! (Scptucst ::tuc>p«')) ( 

$epuoiu('!i&st:r/.!n') - fcCind.firsc.tusciBgdhitcor/.Mueipt): 



i&loAd.cbtciuum SopLionsl'ebwksuB'll: 



« rtad lA eacbtd ssdiiicauon daus ehteksuns. if any. Than if w* 
I Mt UKU vith ZKtJtA into, updata clia cacHad info basad on any nav 
I Lxilo. a. 9. 'laat aaan* dafcas. 

if iSeptionst'cacM.aodciaas*)) f 

«cacAad.B0dC2Ms - 4raad.cacbad.aodtiaaaiSeptioasl'CB.fLla')i: 
» aUa I 

leacAadjBodUMs • (I: 



IS (Sopuonst 'biatcry.fa* I rw *') t 

%viaic.tiBa» • fclead.histetvi$epcioiia('hiacoiy_Ca')»: 



it (SoptioQsfvrl'll I 
Surl > fcwMuxl'assolutalStaaaa.Soptienaf'iirl*)): 
it i!4Mfwiiot-aU5wadl$url.$Us«rA9anc)l If 
dit 

'Sorry, but it* raooct sita doaa not allow rataeu to ratriava that OU.ui*: 

) 

%urls • 4axtrset_:inks.dase($url}: 
) alai! iSoptisnirscAl.url'M I 

Surl ■ fcwwwurl'irsoluca(Shast.SoptionA('bcml.ttrl*)>:t' 
:1 lifcMMsoc-alUwadtSuxl.SUaarJbgant)) {%' 
dia 

•Sorry, k; raaota iita doas not allow robots to ratrir^ that UM..\W; 

) 

print aT 4display.£sliSurl}: 

a«xt; ^ 
) tlst ( 

t ust -hotlis:.** naatc to hava qualitlad pattazn ntcbanv m call by raf - 

• :s thia ntcastary? 

UcaKthotlist.'^ls. thotlist.datas) : 
Uoad.^tlisttSopticasCitotlist.ta-}. *hotlist.urls. •hotlist.dataa); 
%url» ■ \hotlist.uri»: • 
tutlt.Atcaatad > k.'^ilist.datat; I 
for iur- iittys { 

'A 'Surli.acceiiedlSurll :vi$xc_ti«a*(Surl)) ( 

;rint srr?^ -xtraia?. •iiit tuaa froa hotlisc for Surl sawar tban 9lcUl his::r;-.*..r it Sopticns( dabttg') U Wuxt.ttt»«!Ssrl» » 0 
S7is:.t.:;:as<$u:l} • Scrii.accasaadtSurli ; 
£elt£c sspcisns; crooldtv}: 

sut ciresoold { 
Itcal Hurl I • 

iertach Su :#tsrtshclds> i 

it e-.-iuSi; i 

prir.c: rr:i?J( 'Osm; cvMbQld Scr Is tsr m %s\ii*. Su. Swi it Sopcioasl'tfttoug' ) » 0: 

t 

:r:»:i«ur.;:s » r . SC. r.'. 60. 'd'. Jl. . 

<SC);tur.:.:. Sir.;»ul:. Sunicsi • #uiutJ: 
S-«JUCs<S:r.is.:.':i:. • ScMsaulc * Scurrwt: 
Scurrtr.: • £ur.i:t!Sc!i2B\aiC): 



;c:*i l$cr:5_:orMr.» • 
Izzii tSm. :nua.iur.:t.Stl • 0: 
i: • Sotig.cArtsh: 
vfiiU ISC ni **! : 

it ($c /Mvtr/) ! 

ncuni -1: 
} «icie lit - /*i\d*M(Dihdw|)(.*l/l I 
SnuB ■ SI. Suaic • S3. St • S3: 
SB «• snua • suiucsrMStoue*): 
i !St t'\6*sn { 

SB St: 
rtcuiB Sh: 
1 tU* { 

dit 'Ca^ s parse zftrtahold Sori«.cbmb*: 



rtcnin Sa: 



sub secs2str {
    local($s) = @_;
    local($u, $res, $n);
    if ($s < 0) {
        warn "Negative time $s passed to secs2str";
        return "";
    }
    $res = "";
    for $u ('w', 'd', 'h', 'm', 's') {
        die "Units not initialized" if !defined($units{$u});
        $n = 0;
        while ($s >= $units{$u}) {
            $n++;
            $s -= $units{$u};
        }
        $res .= "$n$u" if $n;
    }
    $res = "0" if $res eq "";
    $res;
}
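
# Illustrative sketch (not part of the original listing). It shows the
# round trip that the threshold helpers above appear to perform: a
# string such as "2d12h" is turned into seconds and back again. It is
# written for Perl 5; the sub names str_to_secs and secs_to_str, the
# unit table, and the sample value are assumptions made for this example.
use strict;
use warnings;

# Assumed unit table: seconds per week, day, hour, minute, second.
my %unit_secs = (w => 604800, d => 86400, h => 3600, m => 60, s => 1);

# Convert a threshold string to seconds; "never" maps to -1.
sub str_to_secs {
    my ($t) = @_;
    return -1 if $t =~ /never/;
    my $secs = 0;
    while ($t ne '') {
        if ($t =~ /^(\d+)([wdhms])(.*)$/) {
            $secs += $1 * $unit_secs{$2};
            $t = $3;
        } elsif ($t =~ /^\d+$/) {
            $secs += $t;            # bare number: already in seconds
            last;
        } else {
            die "Can't parse threshold\n";
        }
    }
    return $secs;
}

# Convert seconds back to the compact string form, largest unit first.
sub secs_to_str {
    my ($secs) = @_;
    my $res = '';
    for my $u (qw(w d h m s)) {
        my $n = int($secs / $unit_secs{$u});
        $secs -= $n * $unit_secs{$u};
        $res .= "$n$u" if $n;
    }
    return $res eq '' ? '0s' : $res;
}

print secs_to_str(str_to_secs('2d12h')), "\n";    # prints "2d12h"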



ii -.-t Scptioasl cac-it.tArvsiislds.fn')) I 
::cal(Sstc.d«:ault.Sstt.:osoUta> > 0: 

t^er-'.-ncilSK. Ss;ti:a${*cacna.tiir«»holdi_rn- )) II dit 'Can't epta Sopuonsl'caclM.tlirtabolds.in't*: 
;r ir.: • I STtL" •T^.^•»w Id* ; -.nM if Sopt ions t • datiu; ) > 2: 
$options{'cm_def_threshold'} = &str2secs($1);
Sstc.difAult • l: 

prir.::!STE£?Jt *\ti-)Os: M\n*, *DiC«ul(*. Sopcioiui 'cm.^CchtMhoItf' t) i£ SepciensCdtbug'} > 3: 

Scpcicasrcs.obsoiiCt.cartsboldM « Mu2tKS($ll: 
SMC.CbSCitCt • 1: 

?ri3tf(S7r£?» 'nl-SO*: \6\n\ •ObsoUW. SppiloMl 'CTi_ohipl«tt.tlirMbcld' )» ;f SopuCBil 'dtbog' I > 2; 
} •Ui! '/'M.M ( 

atxt: • tkip eeaant 

i cUt ' 

IxaKSttJl. Smini « tplU(/^i«/l: 
if -.lain nt 'M ( 

SciirtshcldtSurU • fcsur2jtcslSala) : 

pushilchrtsnolds. Surl): 

pr:=tStSTtE» **.t%-30i: %d\ii*. Surl. SchrtsholdUurll) it SoptienaCdtbug* > 3h 

%c^zit7j ; ca.^el.tnrisbold' ) ■ fcscr2s«citSopciORSt 'ca.daf.^hraihold' } i : 

:; 'Sstt.rriDUtt- : 

£cpti:r.s ; crt.criaie-e^t.^rts.iold* ) ■ 

>i:risccs :Scptionsr:9i.oiDsolate.ctar«sh.oXd' K : 



$urls_processed = 0;

@new_urls = @old_urls = @skipped_urls = @new_chksums = @old_chksums =
    @inaccessible_urls = @nobots = ();

print STDEM u zim, Mn- it $optionjCd*bug'); 

# create pipes to talk to the server -- this should be changed to use POST
# directly.

pipe(GETURLS, SENDURLS) || die "Can't create pipe: $!";
pipe(GETRESULTS, SENDRESULTS) || die "Can't create pipe: $!";
if ($pid = fork) {
    close(GETURLS);
    close(SENDRESULTS);
} elsif (defined $pid) {
    close(SENDURLS);
    close(GETRESULTS);
    open(STDOUT, ">&SENDRESULTS") || die "Can't dup stdout";
    open(STDIN, "<&GETURLS") || die "Can't dup stdin";
txtc $vbats_3iv II die 'On't «xsc POST: Si*; 
} else {
    die "Can't fork: $!";
}


Si • 0; 

for Surl ikwft Virls) ( 

local (S:; • 4CAraihold($url): 
next if :$t -U; 

printf smR?. 'CW.SRV: Surl thr«atold $t\n* if tSopcionil 'verboao')) : 

next unlets » Surl — /"litip:/»; 

next if isurl •> sti'http: ; • akip scripts 

pr»tt:s&lsatLS •%sUitl.td««a«i\D*. ISl > 0» ? *&* : **. Si. Surll; 

» 

EiOSetSESDOLSl: 

•-ftiiei<cr:?»r,"s>' •: 

print tSTrs?,". ••—-!_*: if Scptionsl'dtbug') > I; 

next ;i ..'atrp:/: 

esop: 

(Surl. Saod. Svhcn. Srestt • splltl/\s*M: 
die 'fiAd ir^ut frcoi bacx an^* if Srtst nt **; 
fcnosucturitSurl). next if :dcfin«d SurUCSurlh 
next if Saod eq -N/A*: 

loca:iScacr.eC.sad. ScacJted.c&ecked. SnefaoUl • 

spli: : ! . . s=ae:Md.BedtiBes($urll l : 
Snccats t If soptisnst'igoorajieteta*): 
if .'Skc > £:ac.ncd.»d! ik iScaebed.^ !• 0) :l 

• srzc *s scA=nil.30d Swhen > Scached.ehecited) i - 

prir.i .rmrJ. •£?■"■. Suri ftciae Saod wften SwntnxaM if Scpcionsi vortooac' »: 
ivzmr. t £:*:r.c.cr.«CK*c if Scacbe.elMCxcd > Swnen: 
$stamp{$url} = 'proxy-caching server';
I tUiC (Saod > 0) ( 

prmiSTOBit -oW inio fcr $url\n*» if Sopticnjl dtbugM > 0; 

> 



# create pipes to talk to the local back end.
pipe(GETURLS, SENDURLS) || die "Can't create pipe: $!";
pipe(GETRESULTS, SENDRESULTS) || die "Can't create pipe: $!";
if ($pid = fork) {
    close(GETURLS);
    close(SENDRESULTS);
} elsif (defined $pid) {
    close(SENDURLS);
    close(GETRESULTS);
    open(STDOUT, ">&SENDRESULTS") || die "Can't dup stdout";
    open(STDIN, "<&GETURLS") || die "Can't dup stdin";
•xtc $proctM^ttii* 1! 4ie -Car- c wtc $proc«»»..:r 1$ . 
} else {
    die "Can't fork: $!";

) 

tox Surl iktys lurlt) ( 

local • iLhruholddurn: 

'5^ -1» I . . ^ ... 

prxnt 'Skipping ignortd URL Juri :f iscptionU dc&ug n; 

naxt: 

print nOEM 'Off: Surl chrtthoW Sf.r,- if iScptiwut vtrtw)); 
nwc . iMKSuzl — rhtipn \\ Surl — /'tiUu>^ 
not if iSttrl - BJ-http: .n?!) ; t tkip scripts 

lut if iSoptioRi 1») »• 0 U ♦.Siirli_prcc«*««d > Soptxon»|aM.uil»H; 

$l»it.cii»ck«<J • $«>d - $vuit_tiM»ISttrll; Sctcwd.cs • 0: 
i{ isurl — /*http:/) ( 

if (dafijstd $cact«d.«odciMt<Surl)l ( 

fSc«chtd_K>d. icaehad.cbacktd. Sootou) • 
split Scsc>wd.sodtiass(Surlh: 
if <$cschod_cb«ck«d » Slsst.cfisekad) ( 
SUst.chsck«d • Scae)Md.ci»ck«d: 
StCtfplSurll • *prmetts rwi': 

if llcachad.Bod > 01 ( 

if (Scscb«d.K)d > Svisic_tiMi<Sttrl) U 
itiM • Scaciiad_9Dd <> 
SoptioosCCA.obsoitct.tAitsliold'll U 
!SoptioAi('want_iattit')l ( 
pumtotv.urlt. iurll : 
Stttli eiM($url) • ScacbaOott: 

print'noDUi 'iwl kDOW to bt ntw: iUppiagXn* if iJopOflMt 
if iSepuoMl'dabug'}) ( 

localtSidAtti > 4ctlM(luxU.tiM(tBrl}); 

locaiisvtouh ^ . 

IvdAM • fcetiMtWisit.ttfU(littl)) If SvUic.clMSfSttrU: 

Svdftct - 'Itavw viiitsd * ualut lvi«ic.ciaM(ittxl): 

chop Sadatt: 
ehop Svdau: 

print StDOUi »\tviiitad Svdatt ($vialt.tlaas<$url)l ■odifitd SodauUk* 

1 

n»t: 

) 

${Bed • scac)iad_aud; 
) tlsif :ScacAad_aod < fi) ( 

ScacMd.cs • •Scaehtd.vod: 

) 

I 

if tSnobets t<; ..:aot»*l 1 

SJtllt'sinw'-slI^i^inow not to aiicw robots; skippingxo- i! iSopcionsi vertost jl ; 
Mxt:' 

{rrint STDCM 'UW. Sml visited Svisu_ts3tBt$urll Md $wd\n' if tSeptionsl dataig-n . 
f ttor« ilva max icr latar - ustd for kiuviftQ whtn to add n^liaais 
Surls.chacfcadlSuri) • Slast.chacktd: 

'^^^i^fT^Ji -SsInSTaiald S.r: ..;t %d tsirsshold %d lS.t«n>HuniI!U-. 
tiM - Slast.cbtcate. !$opiicnsrv«tDo»o' 1>; 

it tsaod > sviaic.tiMa(Surir ; 
pusn I #nav.url s . Surl . : 
} else {

pusbl9fkipp*d.urls. Suril: 

SufU.ciMtSuzll - Saed i( Saod > 0: 
fiMC: 

1 

prwc sronw •CUtciiinfl Surl\D* if UepticiarvtrboM J); 
•Mxc if iSopcionsCnop'll: 

• nMd le cbangt this :o handlt oussinc Silu. dovn hosts, ttc. -- bt 
I soarttr Atxnit return etadss. 

loc*liSs«ndc8d> • tprincflMs %d ld\n*. Sisrl. Sltst.dMClud. 

(Scacbtd.Bod > 0) ? SeM:b«d.nod : 0): 

print SOmUKLS Ssmdad: 

princt STDOm Ss«Hkad' if SoptiOftsCdthig ) > 0; 
r tltt { • CiU: 

lxal(SCilt> - Suzl. 
Sfiie - s?*fiU:/»?/?J 
SfUt «- s?-lilt:?$DlvrKW)/?: 
SfiU — s?"/locaii»st??; 
    if (-e $file) {
        local(@ary) = stat(_);
        $time = $ary[$ST_MTIME];
        $urls_time{$url} = $time;
        $stamp{$url} = 'local filesystem';
        printf(STDERR "File $file exists -> $time\n") if ($options{'debug'});
        if ($time > $visit_times{$url}) {
            push(@new_urls, $url);
        } else {
            push(@old_urls, $url);
        }
    } else {
        push(@inaccessible_urls, $url);
        printf(STDERR "File $file does not exist\n") if ($options{'debug'});
        next;
    }



I 

close(SENDURLS);

while (<GETRESULTS>) {
    print(STDERR "<-- $_") if $options{'debug'} > 0;
    chop;
    ($url, $mod, $status, $resp, $rest) = split(/\s+/);
    die "Bad input from back end" if $rest ne "" || $status eq "";
    &nosuchurl($url), next if !defined $urls{$url};

    if ($status eq "NOBOTS") {
        push(@nobots, $url);
        $cached_modtimes{$url} = join($;, $urls_time{$url}, $visit_times{$url}, 'nobots');
        next;
    }

    if ($status eq "NOK") {
        $status{$url} = $resp if $resp ne "";
        push(@inaccessible_urls, $url);
        # don't check again until timeout
        # $urls_checked{$url} = time;
        $cached_modtimes{$url} = join($;, $urls_time{$url}, time);
        next;
    }

    if ($status ne "OK") {
        printf(STDERR "warning: bad status $status from back end, url $url\n");
    }

    if ($mod > 0) {
        local($changed);
        if ($mod <= $visit_times{$url}) {
            push(@old_urls, $url);
            $changed = 0;
        } else {
            push(@new_urls, $url);
            $changed = 1;
        }
        $urls_time{$url} = $mod;
        $cached_modtimes{$url} = join($;, $mod, time);
        printf(STDERR "OK: $url ($mod) %schanged\n", $changed ? "" : "un")
            if ($options{'verbose'});
        $stamp{$url} = "polled";
    } elsif ($mod < 0) {
        printf(STDERR "Checksum($url): $cached_cs -> %d\n", -$mod)
            if ($options{'debug'});
        $cached_modtimes{$url} = join($;, -$new_cs, time);
        if ($cached_cs != $new_cs) {
            push(@new_chksums, $url);
        } else {
            push(@old_chksums, $url);
        }
    } else {
        print STDERR "NOK: $url resp $resp\n" if ($options{'verbose'});
        push(@inaccessible_urls, $url);
    }
}



exit if ($options{'nop'});

print OUT "<html>\n";
if ($options{'display_header'}) {
    chop($date = &ctime(time));
    print OUT <<EOF;
<title>Status of URLs</title>
<h1>This page generated by <a href="$w3newer_url">w3newer v$w3newer_ver</a> on $date</h1>
EOF
}

print OUT &display_url_list;
if ($options{'display_footer'}) {
    print OUT <<EOF;
<hr>
This page generated by <a href="$w3newer_url">w3newer v$w3newer_ver</a>
a program written by $w3newer_author and modified by $w3newer_author2.
EOF
}

print OUT "</html>" . "\n";
close(OUT);

&save_checksums if ($options{'checksum'});
if ($options{'cache_modtimes'}) {
    &write_cached_modtimes($options{'cm_file'}, %cached_modtimes);
}

exit;

sub url_desc {
    local($url) = @_;
    if ($urls{$url}) {
        return $urls{$url};
    } else {
        return $url;
    }
}

sub bydesc {
    &url_desc($a) <=> &url_desc($b);
}

sub display_url_by_month {
    local(@urllist) = @_;
    local($url, $old_month, $month, $month_no, $list, $_, $day, $wday, $wdaynum, $bold);
    local($vdate, $source);
    $list = "";
    $month = "";
    for $url (@urllist) {
        $old_month = $month;
        ($day, $month_no, $year, $wdaynum) = (gmtime($urls_time{$url}))[3,4,5,6];
        $month = $months[$month_no];
        $wday = $days[$wdaynum];
        if ($month ne $old_month) {
            $list .= "</ul>\n" if ($old_month);
            $list .= "<h4>$month $year</h4>\n";
            $list .= "<ul>\n";
        }
        $list .= "<li> <nobr> <a href=\"$url\">";
        $list .= &url_desc($url);
        $list .= "</a> ";
        $bold = ($options{'embolden'} &&
                 ($urls_time{$url} > $visit_times{$url}) &&
                 ($visit_times{$url} > 0));
        if ($options{'last_visited'} && $visit_times{$url} > 0) {
            chop($vdate = &ctime($visit_times{$url}));
            $vdate = ", visited $vdate";
        } else {
            $vdate = "";
        }
        if ($options{'timestamp_source'} && $stamp{$url} ne "") {
            $source = ", based on $stamp{$url}";
        } else {
            $source = "";
        }
        $list .= "<STRONG>" if $bold;
        $list .= "$wday, $month $day";
        $list .= "</STRONG>" if $bold;
        $list .= $vdate;
        $list .= $source;



Slisc 
Slisc 

) 

Slist 



iftMpsbotCSuzll a Soptiensf 
&snapshQt(SttzII : 
• *</noDr>M)*: 



locil ISurlJ • 9_; 
rtturr. " :f Sur: .'http:/: 
if :Sopcicnsl 'loras' I) { 
rttum ij/z' 

<iszn Mthod»PCST »ccAon«-$optiOtts( inaf shot.url* ) 

<Mltct a«B«»''5yp»*> 

sopcion.scrisg 
<.t«ltcc> 

«*for»> 
c. tonc> 

I 



sot (Usplty.url.lisc { 

loctlcsiisti « tbi» pavt 9ta«r«cid bjr wJMwtr SwjBcwtr.vtr cy • 

.'Brooks Cucttr ( SM3BtiMZ.attttor.mil 1 — >\n'; 
if ftn«w_utU» ( 

Slut * *<h2>T)it Colloviag URU hm chiB9id:«/h2>\n*: 

Sim .- 4diipi«y.url.to.aonihirtvirw sore I $uili.tiM($j|<«»Siris.:;st{$bi i •nw.urHi. 

J 

if tSoptionsCchtcksua*)! I 

If t«n*w.cWtiuMl^l^^ following C8U My hivt ctungid (ao Bodificauon iiat miUblii:*/h3»«»«ul»VB*; 
foraicb Suzl (tocv.chknas) ( 
%dm%c • fcurl.duclSuri); 

ilxiz qq?<U><ttobr> <• lir«(«>Sttrr*Sdt9C</«>Va?: 
suit .» AsnapsbociSurl) if SoptionsrwiptbeC I; 
Sli«c •</BOtoo*; 

I 

SUst •</ttl>\n*i 

) 

if tlold.chk*u«»^<^ foUo-infl UItU don-t ...d to li«y* ciungtd mo Mdifiucien t«t mll.blt»:</M.«i«-l>M»- 
(arcAch Surl (Oold.chkMMl ( 
Sd«sc • 4url_dtsclSurl): 

Sliic .» qq?<Uxaote» <» hr«f»*$itfi*>$do»c</»>\n?: 
Sim uaapsbotlSurl) if Sopuonst snaptlaocM: 
Sliit •</nohr>'; 

i 

Slis: •</ttl>\n'.' 

if iSopcicMl dispUy.rocaaf) U tikipptd.urli) { ^ , , . , 

$li$t .. •<riJ»Th« iolltnfing OWj wt cbocntd rtcantJy mfi net psi.od:<.i2>Mi«Ml»\R . 
for Suri »*ort bydwc •»Jupp«J.arii» ( 
Sdssc > hurI.d«tc(SuzI»: 

Slxit • qq?<ii>*iiote> <* hrtf«*$url»>$d*»c</«>?: 
if (SoptioasCciMitap.teurct*) U Sttaa^tSuzll at *M ( 
LocAl " * SurU.cboektdlSorll: 

it < 01 ( 

if (SopticnsCvtttest'U \ 
local (Sf 00. Sfcol. Sfoo2»: 
enop($too>fccciiDe(tim} ) : 
Stool - Sfoo: 

ctMpiSfoo*betiMUurls.ehccKt±:S.:ri:i ■ : 
$foo2 = $foo;

««re appartnily chtcud in tat tutur**: 
priPLtl5:00» Hiae li catcntd %f'.P.'. Sfool. SJ3c2»: 

S 

• Sift " 0: 

r 

;octl iset « uK>2stris»9tt: 

St •* S'^d^tS//: 

loc*Iilt90> • spriatSfIt If. 

(StetqplSurl: tq •Juteory* ? 

•vijitttf* : "stekti-. 

St! : 

$;;st . fi»»t<» in SsttBplScr:^ S*90 tflor 

. ifBtpitettSuti: ;1 Saptiowi ir.ipifcc:-l: 
■■ •«.'noBT»\c'; 

;t S:rt::r.s: iis?:»y.unJBtcwr.-) ii «ittcztiiit.»."ls' { 

$ ^- . •<h}»UnL>li to tectss ti» foUtwir.? ORL »:*/W»m«ul>»n- 
!:r sur: ncri l5utlslSt»«»>Suri$(Sb)) *iMCttMiolt.url»i t 
Sdtse • 44rl_4tiCtSutl»; 

Slii: « ^><U><ncbr> <* hrtt-'Sur ■>$tftsc«.'t>\B?: 
;i :$uils.tcc«i»t<JIJurl) > 0> { 

cacpiSvMit « kCtiatt$urls.ACCtiM4(Surl)JI; 

SU«: • MUst viiittd Svmi) ' 

llMt ktMp»hotl$\>rl» i£ SopiiOMi tMpiboi'l: 
SUst .• •«/iwbr>*: 

:dgUn»d4S»t«tu»($urX))l ( x.* 
IpctHSwhyJ » s«wwtrTorH»tplitf»«?i(S»t«ttt{$arll): • : 
if IdtfintdtSvhyn I 

Slx»t .« ■<m.>«U>$why</UL>\B": 

) 

\ 

Sim * *</ul>xn*: 

I 

-crO*!!*. folio-ir^ U»U trt not KCttliblt to wbou:./hl>«»««l>vi»-; 
torttcb S;::l (Vnoboti) ( 
Sdiic • iurl.dt*c!4uri»: 

Sim <iq?«li><no&r> «i hrif«'$url*>S^«e*^»>V«: 
Sim - fctntpiliotiSuilJ it $optionii'«n*p»tofh 
Slut '</ftMr>*: 

I 

Sim •• '</ttl>\B*; 
rttuxn(IUit); 

• 4 9entr:c routXRt ic figurt out whit kind of hotlisi 

f potintitliy Mtful if tto hoclUt U ttd a» througft ttdia or •OMtfcuw. 

tub Lotd^hotUst ( 

LoeailSfiit, -MtlUt.url*. •tootUtt_dttt«» • 

;oalt%etsct.%tlMS.$.): 

:ptDl3l.$filt: II dit -car/t optn filt S'ilt tor rttdinj : $!\n': 
$. • <!»>: 

Uo.dj««ic.aotim.or.MstorvilK.i. -Mtiist d»tt». -botUu.LtUi. 
I tuif ';<!aocTm iicaiiscTSWE>-Boote»rk-fiit-i>/i { 

lXo»d.aiucapt.ooowimiIH. •hotlm.datti. •botim.tawi : 

* '^dit •Con-: know hov to hwdU hetliit witU Joraw S.': 

£lOMC3:i: 

ff^ :;«d_ccii;:..-=tli«t.cr.htitor» • 

;:c»iiSJ:ir.i:i.SkPtii5t.*f^i. 'inn- • »_. 
.T=»i!s. :.r:.sdt*c.5tiOi!. 
t discard 'Oteiuit/Clobar 

su:". • SI: 
■•-:=• • $*. 

::i 'Cir.*! ?«r»t input liM J. : 

ii«i:»Uur';> • Sdtic: 
f 

i-:rii .$:r: ! • iloc»ld«t«o»tiBt»StxMi : 



J — . *£rj.-:dit»; • "»t liU 

»j«ri«lv.-: • liTJii . :.3int • Ml- . •coDttntI ) ; 

* • <a\t . -artf v»' • > - •>$/!) I » 
I'izl • S:. SdiK • ** : 

StlM • SI: 
- tUt t 

Stiat • 0: 



.rule i<j«fintd ^ , 

Sdtic SlinkafSliakftl: 
-riK SIDW •t.tr.ci.lii*. round UKL sun t« $t«««- i£ «lapcionH-dthi»-) > SI; 
I Surli • SdMC; 



i 

f S;ttc !:r ;:»tsr/ SiXts. 

=ptnir;.i:iltJ II ii. 'Cw t op«i tilt $mt tor toadmg : $!>n-: 

pr:r.t *Lc*lis? hiixory liU 

$ • <2i>: 

A t/r.5iJ-.-aBi«c-hi«toty-toiMt-l/^ i 
el;.; '-MCCii-Clcttl-hmory-filt-l/) t 
loc*l:S..;url.seiMl: 
vhiU .<StvandXt» I 

Surl • SI: 
scut > S2: 

dift *C»n-t pir»« input Um 5. : 

) 

5r;ats($arll • StiM: 
rturr. ^tuts: 



SUP txtrs:t.:irJu.dtse t 

l5C» : t S : irJts . i i : aiu . II inki I ; 

Loc*i .art-..$-::. :,.5d«ic.lh«t(itr«.$conttnz.$rt$pcniti . 

''T^^*U«Ta..crCt?-. ••:n..r:. -h..**.". '.cnttnt. Sgl.r«^«.t.t«««c. : f 

•rrst STS&A conttnt'— — *R : 

?rS: STOM $coiie«iit: 

;rir.t S?!ffiS« 'xn- - — «id conttnt tn*; 

} 

tip«»f ! • l;nXj , $conctni > ; 
$. • $Ju«»l$liJA»>J 

Sttrl > SI: SdiSC « ": 

Sdttc SliiilLStUinksh 

irmt STOWH •«tricc_liaiu found UXL $url\i»' if l$flptioM(»ditaig-l > «: 
oubffrtt.Suxi.SdaMl : 

prm OTEW -wtncOinkt found XW sun- if i$c|>tioiul -dtbuflM » Sh 
puibiftttt.Sl*SJ; 

* '^lai $tW» •mrist.liiA. ignoring $.mi- if tSopuoni* 'dtbugM > 5): 
MXl: 

) 

i 

ztciiniilrtti: 

) 

sub displ ly.htol ( 

lociilSin.urU • fixifttfj: 

print srom •diipUy.hcallSiA.urllXn" £f ISeptionit i 
local I Jl:njti.41iii)t».%llnk»» : 
ioc*l ( » t tt . Sur 1 . S_ . Sd«*c . IbMdixi . Sconttnt . Sfiipoiat I : 
Ioc»H Sday. Saonih.no. $y««. SirtiynaaJ : 

sr«Q»ni. - v^'lrtqueicrCET'. •in.url. •h.«J.r.. -conttnt. Sgi.fq««t.t«out.: r 

4p*rit.£tsil t • 1 inU . Sconttnt i : 
SlirJiS " -I: 

rtiU id.iir.ri ilinksl-slinksll t i 
•• 'jlinxslSlinUi — /*<\i'»#Jn«wr\s»url«" ' .♦l*\f >J/li I 
iscuaj • ifcww stittSl.SUstrAgtnt.SRfpiyitJlUll: •* 
unitts *$tiM> 1 puiht»rtt. -uateown-i: nwt: » 
iSd»y.$Bonui.no,$y«»r.Swd«ynt»i • tgatiMi$tiBtiK3.*.5.6i: 
laenth - SwnthslsnacH.ool : 
Swtoy • SdiyilSiedirynwl: 
puibi*rtt.*lwd*y. Way Sy««r»i: 

> •!» ( . . 

juihi >m . Slink* I Slift^l » : 



Tsriri irs: 
sub read_cached_modtimes {
    local($file) = @_;
    local(%times, $url, $mod, $checked, $nobots);
    open(CACHED, $file) || return ();
    while (<CACHED>) {
        chop;
        ($url, $checked, $mod, $nobots, $rest) = split(/\t/);
        die "Bad input from $file" if $rest ne "";
        $times{$url} = join($;, $mod, $checked, $nobots);
        $stamp{$url} = "previous run";
    }
    close(CACHED);
    return %times;
}

sub write_cached_modtimes {
    local($file, %cached) = @_;
    local($mod, $checked, $key, $nobots);
    return if length(keys %cached) == 0;
    open(CACHED, ">$file") || (warn("Unable to write file $file"), return);
    foreach $key (keys %cached) {
        ($mod, $checked, $nobots) = split($;, $cached{$key});
        printf(CACHED "%s\t%d\t%d\t%s\n", $key, $checked, $mod, $nobots);
    }
    close(CACHED);
}

# localdate_to_time() is a modification of get_gmtime from wwwdates.pl
#
# Translate a date string to machine time (seconds since Epoch)
#
# Usage:
#
#     $mtime = &localdate_to_time($date)
# where $date is in the following format:
"nai rtb 3 ;i:Oi:&s wr 

sub localdate_to_time {
    local($_) = @_;
    local($day, $mn, $yr, $hr, $min, $sec, $adate, $atime, $mon, $midx);
    local($offset) = 0;

    # Split date string
    local(@w) = split;

    ($weekday, $mn, $day, $atime, $yr) = @w;
    $yr -= 1900;

    ($hr, $min, $sec) = split(/:/, $atime);

    if (!$mn || ($yr < 70)) { return 0; }

    # Translate month name to number
    $midx = index($MoStr, substr($mn, 0, 3));

    if ($midx < 0) { return 0; }
    else { $mon = $midx / 3; }

    # Translate to seconds since Epoch
    return &timelocal($sec, $min, $hr, $day, $mon, $yr);
}

t cor 
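
# Illustrative sketch (not part of the original listing). It performs
# the same kind of conversion as localdate_to_time above, written for
# Perl 5 with the core Time::Local module. The sub name ctime_to_epoch
# and the sample date are assumptions made for this example; the result
# depends on the local time zone, so no expected value is shown.
use strict;
use warnings;
use Time::Local qw(timelocal);

# Three-letter month names packed into one string, so that
# index() / 3 yields the zero-based month number.
my $MONTHS = 'JanFebMarAprMayJunJulAugSepOctNovDec';

sub ctime_to_epoch {
    my ($date) = @_;
    # Expected layout: weekday, month name, day, hh:mm:ss, 4-digit year.
    my ($wday, $mn, $day, $hms, $yr) = split ' ', $date;
    return 0 unless defined $yr && $yr >= 1970;
    my ($hr, $min, $sec) = split /:/, $hms;
    my $midx = index($MONTHS, substr($mn, 0, 3));
    return 0 if $midx < 0 || $midx % 3;    # unknown month name
    return timelocal($sec, $min, $hr, $day, $midx / 3, $yr);
}

print ctime_to_epoch('Thu Feb  3 11:01:05 1994'), "\n";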
#!/usr/local/bin/perl

# This script is a backend to the snapshot script in my CGI directory.
# It is designed primarily for purposes of debugging: it doesn't rely
# on having a CGI environment set up and instead uses just
# command-line args.

# one of the things being done for simplicity here is to just keep
# the log of <url,time> tuples, instead of
# overwriting in place.

# to do:
#     locking

$debug = 1;

$need_locks = 1;

$maxlength = 1024 * 512;    # 512 Kbytes

SsMPShst « •htCp://»*'Xl26.reti»rch.att.coa/c7:.-b.=/cc_..»aas . 

# this should be PWD but that may not be normalized w/ /home.

$snapdir = "/home/douglis/tmp/cgitest/snapshot";

local($oldfh) = select(STDERR);
$| = 1;
select(STDOUT); $| = 1;
select($oldfh);



unshift(@INC, "/home/douglis/lib/perl");

# architecture-specific library (for sys/socket.ph)
require "arch.pl";

if (-d ($libloc = "/home/douglis/arch/$arch/lib/perl")) {
    unshift(@INC, $libloc);
}

($FD_CLOEXEC, $F_SETFD) = (1, 2);    # from solaris sys/fcntl.h

$lib^ww_ver - 'O.W; 

'^^!u£i«.PBlL'» 11 •/ho.e/dougUs/lib/perl/Ub—- Ptrl.Slib.%««.>rer-. 

\i 

Sroot « SDJVriWM; 



if tdelined SDWCPAWn I . w. 

I print STDOa 'Path: SDJVrPATK' »<br>' if Sdubug; 

' •iJllil'WH-) « whc«e/do«gli./«ch/$«ch/binwho~/douyli./bin:/bin/:/u.r/.binw«.r^^^ 
J 

unless lSOIV<*httpjtfoxyM> ( 

t SOWCaojroxy)='»tt.cca*; 

t want reseereb.att.eoa to 90 outside ^113* ,-.-irch.atc.coa.ih.ett.co».ho.»tt.cca.cfc.»tt.cca.at.*t 

SDWt • 00 jroxyM«'tbo.att.co». ncr.coa. radisn. research. ett.con.wwwll2ft.reseircn.aw.c«.*a 

^com.divUl.atc.ecB*: ^ .aiia*. 

SEMVI 'httpjTOxy- }.*http://radi8h.res«arch.att,coB:8000/ , 
• snivi'httpjroxy ).*http://blttel.reaaarct.act.co«:iOSO/ : 

1 

require 'cgi-lib.pl';

$EXIT_FN = 'cleanup_tmp';

require 'cgi-exit.pl';
require 'cgi-alarm.pl';
require 'duplexpipe.pl';
require 'normalize_html.pl';
require 'ctime.pl';
require 'stat.pl';
require 'www.pl';
require 'lock.pl' if $need_locks;
#require 'flush.pl';
require 'printid.pl';

it tSdebug) i 

Sarqscrm; » joir.!' ■ ««CV1 ; 
print STSEW tprincid: 
t«it.?ractfuUyt -unable to ocace htaldx^ . ^^/t/^I^J^^/s^^iJ^, 

•lexu^racetullyt -unable w locaw fiUer.anchors . 01 
• :t 'defined S filter .ancbert; 
Srctdir * Stcidiff: 
Srcidzr *- s!/resdifS$i!: 

Sci=*Srcidir/ci'; 
Sec* 'Srctdir /CO*; 

!ri^^??^^:w:-ti^r«..rch.....c^ 

sor 

cnop t S r.r .r^ndi.auihor i r 

Shanlciii.author » «EOr; R.n«/A> 
<A K?X:=T.-.ip:/.— -ipr.ih.att.coa/-tb*n/->TC3 B»il</*> 

EOF 

cftoplSktrliif f.authcrl ; 

• 4*xi-..5racifuilyr*<rong owr of arguMncf. 1) if i*M.Qt \^ 2: 

StraiXer « «Eor; 

</BOD¥> 

<lir> 

nr.J.-«iito:«oh.ndsX»tc«e«.r..«rcli..tt.e«.-.»— «.<,.» «. wpr.e».t«l. 

</KTML> 

^tttn Dy Sbuidiff.author:: 

($operation, $user, $url, $userversion) = @ARGV;
Subfiles s 0: 

# assoc. array of operations and required fields. u->URL, e->email(user),
# w->write access, p->people

%allowable_operations = ('Remember', 'uew',
                         'Diff', 'uew',
                         'Status', 'ue',
                         'List_All', 'e',
                         'History', 'ue',
                         'Users', 'p',
                         'All_URLs', 'p',
                         'URL_users', 'up',
                         'View_Current', 'ue');

&exit_gracefully("Illegal operation $operation specified", 1)
    unless defined $allowable_operations{$operation};
• Pr.=t STDDU. .xSaao-abl..operation.(.operat»n,.V-SaUcwabie.op.rat.onsiSap.racxo^ if Sdebu,: 

11 ;Sal-.r.abif.cperationslSoperacicn» - 'e.M 1 ..xw.|-S/: 

iexi-..;racefuilyflll«9*i 

Jtists » i*people/Suser*I: 
if ($allowable_operations{$operation} =~ /u/) {

iexit gracefully liprincliqqSXllt^al URL: Surl;. li 

.t Suri I- »!-http://l\w«.-.A-:W$l; 
< '.t:; uriiil add i090in9 

{Scperition eq •Viiw.Curienf) t 

pria: "Locition: Surl\a\D*; 

«xir; 

Sstec_rei « li 
• tilt • 

Sottd.rct « 0; 

li :5a:Xc«cle_optr*cion»|$optmiony /p/) i . ., «. 

3ptrxir(PB0PLE. 'pwjpltM ll 4ejcit jract£ullyCOn*blt w opta dirtctsr/ -ptopAf. 0): 
local («ptapAt» s rttddiriPtOPLCI: 
i::at» « grtp(/*U-xl/ « »'-!ptopit/!. iptopUi: 
?r;nc! STDERR •Chtckicg uitrt %8\n*. join;' *. •ciaesi if SdtbtiQ > ^: 
:lotti?COFUl; 



i;r._u:l » fcurl2tnl$urU ; 
Srciiile » fcrcsfiltlSln.urXl; 

if SrcsJiXt) { , , * ,j • « 

print* STOEM '%» prtviously chtckta in\D'. SurX it SdtnuQ > - 
• 4«xit jrACtfullyCrilt Sresfilt aot wri«blf . 3! unltst 
StJCitttd • 1: 

• tlM 1 

Stxiittd > 0: 

» 

iustrs : :i It Sopcrtcioa eg *muus«rt*: 

if ''Soptracion tq '01ft' U Soitrvtriion !- /\d*/) II 
Scptracion tq 'Uit>ll' tl ,, 
Scptracion tq •Kaitory* ll Soptration - /.-uatra/i j| 
Soptratioo tq •Rt»«btr_i£.ntW 1 1 Soptratias tq •AXl.URU J I 
fortaca Sciata (SciMtl { 

if (SoptraeloB at *U1IL_ttMrtM t 
undtf SlasuMBi 
undsf %Xaacit«n: 



(•r Stxatt) { 

&exit_gracefully("file $root/$times not writable", 0) if ! -w "$times";
open(TIMES, "<$times") || &exit_gracefully("Unable to open $root/$times: $!", 0);
unXtai t!$Dttd.lock» || UXock<Tl«S, III ( 

.«.t^.c.lullyr«»W. ^•J^J^JJJ Sl^i;. Pi.«t try again lattr. lYou can r.Xoad thi. paflt.r 



) 

while (<TIMES>) {
    next if /^#/;

    local($turl, @when) = split(' ', $_);

    if ($operation eq 'Diff' || $operation eq 'History' ||
        $operation eq 'Remember_if_new' || $operation eq 'URL_users') {
        if ($turl eq $url) {
            if ($operation eq 'URL_users') {
                local($user) = $times;
                $user =~ s!^people/!!;
                $lastseen{$user} = "@when";
            } else {
                $prevseen = $lastseen if $userversion eq 'penultimate';
                $lastseen = "@when";
                $lastseen = join(' ', @when);
            }
        }
    } elsif ($operation eq 'List_All' || $operation eq 'Users'
             || $operation eq 'All_URLs') {
        $lastseen{$turl} = "@when";
        $lastseen{$turl} = join(' ', @when);
if (Soptration tq -Xll.URLt'l I 
I Mt htrt: uat if stt juit Ix 
SXastuatrlSturll * Sciats; 
SXaituatriSturl) s : *ptopXt/ ! ! : 

I 
$lastseen = $prevseen if defined $prevseen && $userversion eq 'penultimate';
eiost (TIMES): 

fcunlocktTIKES) if SneedUocki; 
li (Sop«ratioo eq •'Jstrf'J ( 

loc*l {9kty») • "y» %lMtf«en; 

local ($u«erl * Sti««f: 

Suter ■- s!-p«opl«/r'.: 

ScouctlSoieri * SlUyi • I; 
) elsif ISoptratisn eq 'Ail.ORLi'J ( 

for tkcy* %lutsetnl { 
Scouu .' 

) 

; eiiif ($cp«x»cion eq •'Jstri'I t 

print 'SotniRg ft" «ver bten chtcKio xa.': 
price Strailtr: 
cxxc C: 

:f iSoperacion cq "Uat.AllM { 

prim tTiiiei-Pagei secsrded for $ui«rV' : 
prir.1 •<n4>S0fted fay Doa»ia<#h%><p>\r.': 

print "tAaU* <TR> <TK>U?U/TH>«TH>Kccificaticr. tiBe»tasp«/TK><;T«»xr.v 
t«crt tydoaam key* *lait$een> { 



II. s;astseen($.}: 

priat •<CABU>'. StriiUr: 
•xu 0: 

) •lsx£ ISoptration tq •Uieri'l t 

print tTitleCMiri of the NO MAHDS service' >; 

print •<TABL£> <?R> <TH>0Mr</TH><TH>PA9ef Tecorded</TH></TR>vn : 

} 

print •</TA8U>*. S trailer; 
exit 0: 

} elsif tSoperatioD eq •URl.uiarf) t 

print kTitlerUieri ww nave recordad Sotl I; 
I print SI090: 

tor tsort Keys lUitteen) I *,.n^k>i.^c -.1* ift>i<Ba>\ii' Ssnaoshet. S . SUscteeniSJ: 

print* qqi<A HREr.**i?email«%«4type«U»t*%20All «%»»<B»>m:.. ssnapino.. 

1 

print StraUer; 
exit 0: 

) «l«i£ {SoperAtxcD eq 'Xll.ORU') t 

print WitleCPayes recorded by tHe MO HMIDS •ervice »; 

* Jr"t m> <TH>URL</TH><TK>y.erls><m><TB>Last varai«ic/THxm>XB-; 

i:Stt^*<;S::SrA ^.nSSl.i...«iU%«cype.vi.^%20Current->l.<;x,<,^^^^^^ 

Sanaps&ot. Stater, 

'pr^Vk!%d <lim-%.7url.%.4type.UIlLl%20«..r.*.«il.l,.>um.<.A>:. 
ScountlSJ. Sinapahot. Suaer: 

* *^"iit£ qq!«A HREr»-%«?type«Lift%%20JaUa»ail«%a->%»<'A>! • 
Ssnapahot. SlaituaeriJ.). SlastusariS.); 

prist •</'ro>*: 

locallSfni » 4rcsfiUt4urI2tD($.)) : 

'"•:SirRCSriu:NL'?'|l «m .C.=-. t ^ V,,-. l«t ,.t«r«c=: 
• do w* need to lock just to read lat line? 
locaKSline. » <RCSnLE>; 

't isiine «- ;'head\s»l\d*\,\d»)/) I ^ . 

pii^!: ^i<T0><AHR£F.-%.?urU%a4a»ail.l*4tYpe.Hi.tcry>l.<.^></TD>xn^ 

Suapsbot. $_. Suser. $1; 
close IRCSnU): 

> 

print •</m>\n': 

prtct •</tA81X>\n$trailer": 
exit 0; 

■ eii;f 'scperation eq •Rtsemer.if.neWJ : 
    if (defined($lastseen)) {
        $msg = <<EOF;
This <A HREF="$url">page</A> was previously saved.

<A HREF="$snapshot?url=$url&email=$user&type=History">View the history</A>.
EOF
        print $msg, $trailer;
        exit;
    }
} elsif (!defined($lastseen) && $operation eq 'Diff') {
    $msg = <<EOF;
This URL has not previously been saved by you.
<UL>
<LI> <A HREF="$snapshot?url=$url&email=$user&type=Remember">Take a snapshot</A>
<LI> <A HREF="$url">View current version</A>
<LI> <A HREF="$snapshot?url=$url&email=$user&type=History">View the history</A>
</UL>
EOF
    &exit_gracefully("user tried to access a URL not previously saved",
                     $msg);
} elsif (! -e $rcsfile) {
    &exit_gracefully("The RCS history associated with the URL <a href=\"$url\">$url</a> is unavailable", 1);



t ;o:a; •headers. 3csncer.:.Sresponat) ; 

it !Scpera--icr. Tif!' «! Separation «- /Renaabtr . • ' 1! Soperaiion eq 'Sistoryt • 
Scurrcr.'. t *;a;atf Sfl^url': 

• if i-t Scurrenti ? 

> ur.lir.A '.£:urrer.t: !| fcaxitjoraccfiUlyC maolc tc ualink Sairranc. 0): 

• } 

if ($need_locks) {
    $lockfile = "locks/$fn_url";
    open(LOCK, ">$lockfile") || &exit_gracefully("Can't lock $lockfile", 0);
    push(@tmpfiles, $lockfile);
    unless (&lock(LOCK, 0)) {
        &exit_gracefully("Unable to acquire lock on $lockfile",
            "A file was temporarily unavailable. Please try again later. (You can reload this page.)");
    }
    print STDERR &printid, "Acquired lock on $lockfile\n" if $debug;
}

if ($operation eq 'Diff' || $operation =~ /Remember.*/) {
    $alivepid = &cgi_keepalive;
    print STDERR &printid, "Keepalive pid $alivepid\n" if $debug;
    $newurl = $url;

    $response = &www'request('GET', *newurl, *headers, *content);

    kill 2, $alivepid;    # SIGINT
    if (!waitpid($alivepid, 0)) {
        printf STDERR "%s warning: unable to wait for child keepalive process $alivepid\n", &printid;
    }

    if ($response != 200) {
        local($rest) = $wwwerror'RespMessage{$response};
        if (defined($rest)) {
            $rest = ": $rest";
        }
        &exit_gracefully("Response $response during GET request$rest.", 1);
    }

    local($len) = length($content);
    if ($len > $maxlength) {
        &exit_gracefully("The page is too large to cache. Content length is $len; maximum is $maxlength.", 1);
    }

    printf STDERR "Get worked!?\n content:\n%s...\n",
        &substitute_entities(substr($content, 0, 256)) if $debug > 1;
    $content = &normalize_html($content, $newurl);
    if ($content =~ /^Error filtering HTML:/) {
        &exit_gracefully($content, 0);
    }

    open(CURRENT, ">$current") || &exit_gracefully("Unable to open $current", 0);
    push(@tmpfiles, $current);
?r:r.: Seontect: 

select 
if •Ssurrtr.fJ i 

iexit.5rac«rully( 'Concent r.cc ccrrecciy copied tc Scooccnc, ItiiQC^ $l«n'. 0); 
iex::.;r«cefyll/ t*v 'Page is apparently eapcy*); 

) 



'Ss'is'. kpriKii. Is -I Scurrtnf if Sdibua; 
if ($operation eq 'Diff') {

oper. •>5:urresr.fMpM II 

iftxit.;ri:t2ii:/ •Vnafcle w create .•5cjrrent.$aap\'*. 0); 
p-sn:Jzi:i;e5. •Srirrtsc.fW; 



il £^ier/crti:r. *- \a/l t 

i'.astscfir. = ••rSuierveriicn"; 
• elit . 

flAs-ftct- : 'i ilAitieen*. 

• tlasuee:. > *-t' 3Iaiciten: 

Sc=i t -iSpii « id--?:expipei'OUr. TS. -C-S. {'Sec-. •iMCseen. 'Irrsfile'i j : 

• » q:«p:: « ic-?lexpipe(»cu:. •£?JIS. ivhOBe/douglis/bin/itawc* . '-pv 'SrcsiiieM ) ! : 

prii:: snus. •S-zai-iatir-^ =: sand %»\n'. «2=d if Sdabug > 2j 
eval icsi: 

if :$S) >: 

tixi:.5raieiii:yj$». 01; 

} 

Mi^t :<XSI>i ; 
prist 311: 

pri=: srrsa •$... " if Sdtbug 9 2; 

) 

prirt STOSRJt ".r.' if Sdtbug > 2 « S. > 0; 
cloie IS; 

print rrasss ipruitiS. 'Error S? from ciost{lM)\n' if $? u Sdibuo: 
close CLD: 
close (CUT): 

prir.: STOSIR iprittid. "Error S? froa eioielOUTlVn* if S? 4( %mai9: 
i! (Sdeinig) * 

local (S first) > 1: 

print STCSWi 4priatid. •Mtsiagea frca co:\n*, uadef Sfiric if dtfieed Sfirit; 
prir.t S73SSR: 

I 

> 

cleie ICXRS); 

print stDERR fcprintid. 'Error S? fro» cIoi«IEWS)\d- if S? 4& Sdibug: 

if <iwaitpidtSpid, 0)) ( ..... 

priat STCEPJi (priBiid. •wmiog: unable to nait for child procttt $pid\D*: 
) elsii ( 

?ri3t StlSXft 4priacid. "$0: warning: co returned acacus S7\n*? 
) eisif :Sdebu6l ( 

jrir.t STCEM ipriatid. 'SO; co returned Olt itatuaVn'; 

J 

if t-: 'Scurrer.t.snap'l t 

::callSls = /cm/Is -1 'Scurrencsnap' ^■ 
iexlt.^rac^:uIly5Sls. •SnapsDOt of page is apparently CBpty'): 

\ 

if •Scurrest.ssap' " -s 'Scurrenf ii 

:iystec:'.-si3/c:p' . *-s*. 'Scurrent.ssap* , 'Scurrenf:! ( 

?r:nt 'Ss differences encouncercd<br>vr/ : 

Sciffs B 
I else I 

SdiffS * 1 

SAlivtpic * ic;i_<eep*iive; 

;:.r.i nzZrA fcpri-tis. •Keepalive pid SalivepidXn* if Sdebng: 
itrai:er »• ». -written by $no.hands_as:tr.or)\./SWhtnldiff .trailer./: 

;;sn ::-T. 'itzzltiii Scurrent.tnap Sr-rrent Scurrent.difff I") II 
"* ieii:_;rata;-ll/''0»aele co invcue r.aldtff. 11: 
push(@tmpfiles, "$current.diff");
i$dibugl { 
l:e«l(Sfirstl < 1: 
vnilt{«DXFr>l I 

p»nc STDERR (priotid. *!tessa9M <res huldief:\n*. undef Stirst if dafincd Sfirtt: 
prir.C STOERX: 

Sdtbu9> 3) I 

^Tiz'.i 5TDERR -Calling closilDXmXa* : 

cicstlDirri: 
li iSdabuvl ( 

print STDEKR 4printid. 'Error S? £roa eiosalDXFFtM:' if S?: 
: tlsif iSdabug > 2> ( 

prin: STDERR tpriatid. •clottlDlFFl OR\n': 

kill £a::vtpid: • SIGUTT 
:: tvaitpid(Salivcpid. OH ( 

;rir.t STDERR fcprxctid. 'warair.?: unacle £o vaic 2or child katpalive process Saiivepxd\n* : 

-e •ScirrMt.diff) { 
If i-s J { 

Sccntenc - 'cat ScurrcccdiSf'.- 

Scooceoc a*</htal>t!i: 

print Scocteot: 
.* else ( 

print "lo dificrtnces eocountered<br»a' : 
Sdif ts 0; 

) 

} else ( 

texxt^r«ce!uilyf!lo diff tile Scurrent.diff created*. 0): 

I 

1 

If !Sd;f!sl \ 

Strailer e/page/aodilied page/: 

i 

$moved = qq!(Note that the URL previously provided, <A HREF="$url">$url</A>, moved here.)! if $newurl ne $url;
Snaybe ■ q:;: You should view 

the current version <stro&9»directiy</ttroiig> it you wwt «• 

href «*http: / wwwll2 6 .research. att.coa/-dou«lis/track.urls/'>wliwwtr</a> 

to Know the page was sees. 

<li> <x KR£Fs'SsDapshot?aBail«Sttserattrl«SurUcype«ReMBber*>ReMiber eurreot version<vA>.<br>*. unless Suser eg "tr^ts^anges* 
Strailer .« «fOP: 

<ui> 

<li> <A KRE?s' Snewurl *>view current vertioo</A>. Saoved 
Snaybe 

<li> <K iS-?>*Ssnapsbot?eMil3SttserSurl«$url4type*Hitcory*>Sea the version hiatory</A>.<br> 

</ui> 
</HTML> 

sor 

print Strailer: 
) elsif t Separation /RoMBber.*/) ( 
if (Sexisted) ( 
it (Sdebug) ( 

open CSAVZm. '>4SnZIR').- 
open ISAVZOW. •HSTDOOT'J; 

pipe IRCSDAS. RCSOOT) || 4e3titjraeefully(* Can't create pipe** 0): 
close I STDERR) ; 
open (STDERR. '>tRCSOW); 
openlSTDOUT. •>iRCSOOT"); 
> else ! 

fcr.t:isToour. sr.sirro. sn>.CLO£XEC) \\ 

fcexi:_9racefttlly{ 'Unable to set close-on-exee Uag for stdout*. 

0); 

fcstiiSTDCRR. sr.srrn). sro.cuEzzc) || 

fcexiuracefully I 'Unable to set doae-on-exec eiag tor stderr*, 

01: 

systeacsrzs -1 SrcsfileM U fcexit^racefuUyt 'Unable to lock SrcsCile*. 0): 

;£ :Sdebug: ( 

ciotctRCSOUTI : 
ciose(£TDERR} : 
ziosclSTDOlTT) : 
CpeclSTZERR. •>iSfcVtniR'>: 
rpexlSTXLT. •>kSAVECIUTM : 
tiose.SAVEEBRi: 
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local (Swarn) = 0: 
while «RCSE]UIS» i 
if CSwarn) ( 

print STDEWi 4prir.::d. *Htttagt trco rcitxn*; 
. Swan > I: 

J 

print STODUl: 

J 

) tist ( 

fcttKSTooOT. sro.'Em). 01 H 

*exit_gr»cefuUy{*Unablt ta cUar clo$e-QD-Mtc eiag tzt sidouf . 

0); 

icntKSTOEW. SFD.srrn). o> :l 

4txicjrac*£uliyC0naJbie sa clear slcs€-on-«3cec flag fcr acdarr*. 

01: 

) 

) 

OtBp » fSci*. •Scurrenc"!; 

if Idcfistd Shtaders{ 'lftst-scdifiee' i 

Sdact * *$haadars{'la»c-eodifiec'r: 

I c'naeic for a cartaio »tyU RCS cq»»= ' li". a* in: 

• Uft-acdified; frxday, :6-HayH :3:19:11 QT 

if iSdatt =- '•(\w\w\w>\w. {\d-.Cf-i\w»t-l\d\d»/l I 
Sdacc » -S;. S2 $3 1«4 $'*: 

I 

spli«*Ot=?. 1. '-asdati'l; 
} tlte ( 

Sdatt B fcctiseiciM); 
chop''iace»: 

) 



• not cltar if wt rtally ntet zo stnd data to res - leavt as-n for now. 

• Scurr«nc3 5/(|•\•l\S^W\\$l/o: 
• locall$c=d) . qq:\Spid = iduplejcpiptfcirr. •IMS. V$qurrmJ\") J ; 
local{$c«i) « QQ!\Spid = *dupiexpip«CCWT. *:», •ERRS. \it«p)!; 

priDtf STDERR 'Evaluating cosnand %a\r.*. Scod if Sdabu9 > 2; 
•val 

if iSOl ( 

4txitjrac« fully (SO. Oh 

) 

if (iSexitttd) ( 
local (Stitle): 

printf OUT 'Tbi* is a laapthot of a page. URL U\n*, $url: 

t would be tioplcr m perlS... 

if (Stitle ■ 4htBl_titlei$conccnt)) ( 

printf OUT •<or><i>Title: %t</i>\n*. Stitle: 

) 

• *^JIiitf Otn qq:A snapshot made by <a iDttf.'S»Bapstot7««il»%»*type«List^X>%i</«>\n! . 

) 

close (Ot/T): 
local ($vam) > Or 
Sunchanged « 0: 
while kHO) { 

if lSdet>u9» I 

i! {!5MTn> I 

print STDEM *princid. -Heaaage froa ci 8tdout:\n*; 
StMn • I: 

) 

print STOCBR: 

) 

if t/*fiie 1$ unchanged: reverting/) { 
Sunchanged = I: 

) 

) 

Swarn s 0: 
while l<ERRS» i 
if tSdeoug) ( 

if fiSwami { . a . . 

print SIDERR ipnntid. •Message frcn ci stderr;\n ; 
Swan > :.* 

;rint STDERR: 

! 

) 

c lose (IN) : 
cioSe(£RRS^- 
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it ( !ttaiipid(Spid. 01) { 

prisi STOZnil fcpristid, 'wnicg: unable co waic tor child proctts $pid\o*: 

) tlM { 
STA1US: { 

local * S?: 

/*lt2401$/ 4a do I • an ok Katus? 

princ snnoui ApriDCid. *waziuog: wucad wich • scacus of $?\n* 

if Sdobug: 
last STATUS; 

I: 

/*0S/ U lUt STATUS: 

fctxit^ractfttllyfci txittd with a stacua of $?*. U: 

} 

) 

&exLt^racafuUy(*Probla» witb 'Ciaas' file*. 0! if SIcims 0: 
Scutes s $Ci9ti(0): 

• fcexujracafttllyCfUe Sroot/Sciatt not wricable', Oi i! ; -w •Sciaas*: 
(Suncoangad) ( 

Sscatus > 'Noca. file was UDCtoinved fron prcvicui eaack-io.'; 
; «isf ( 

SsMtus s 'Cback in was successful . * 

) 

pxint qqiSscatut You can view ibe <A KllEr«'Scri*>egrrar.i version<iA> or see the <A KREr«*Ssoapahoc?caBil»Sustr».;:>£^r.i:-;Tet.-:^ 
^ istcry >versacR hiacory< /A> . \n : ; 

print Straiitr. •<p>*: 

c; ntTZM£S, '»$tuMS*) i| •axic^racefuUyr Unable te cptn Sroot/SciMa: '!*. 0): 
vrnw (tSoead.locU || aloeklTZISS, 0)1 ( 

print STSOtt aprintid. 'Unable to acquire lock on Stiaes... writing an) ay.\n*; 

) 

print! TIMES *%a %8\n*. Surl. Sdata; 
close ITOOSJ: 

aclMBup.av; 
exit 0: 

I ) elsif (Soperation eq 'List Jill') < 

I elsii ($operatioa eq *History*l I 

local (Scad) « -/usr/sbin/perl SrlogZhtal Ssnapdir/Scurrent Surl Suaar*: 
Sod *- s/(t\»%\Sl)/\\$l/g; 

printf STDOW *%sIavokiog %s\n*. Aprintid. lad if Sdabug; 
opaalUOC. -Scadl'l tl 

aanc^raeaiullyrcan't invoke Srlog2btal*. 0): 
while 1<UM>) I 
print S.: 

1 

close IRLOOli 
if ( . 

if (Sdabug) i 

print Snm Apriatid. 'Error S? fron cloaelKMGlNn* : 

) 

print 'Tbaxe was so error invoking rlog. *; 
) else { 

printf snm 'close (UOG) OK\n* if Sdabug > 2: 
prist f qqtTou can see a 

<A inzr«*%s?url>%s4cypa=aRL%%30usar8&aaail«%s*>li8t of usars</A> wIm heva recorded tbia paga.<br>\n!. Ssnapahot. Surl. Suscr: 

printf qqiTake a <A nt£F»*l8?urls%sAcype«fteBenDerAeatailB%s*>aBapsbot</A> of cha currant page. (You will :c c==si zc ^ 
<^the list of users if you ar« not already there. )<br>\ttu Sanapshot. Surl. Suaer unlasa Sutar eq 'tntecbangcs*: 

» 

print Strailer*' 

) 

(cleanup.tBp; 

# taken from html-parse (and reversed)
sub substitute_entities {

    local($string) = @_;

    $string =~ s/&/&amp;/g;

    $string =~ s/</&lt;/g;

    $string =~ s/>/&gt;/g;

    $string =~ s/\"/&quot;/g;

    $string;

}
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I priac STOCVT * *: 

• pjisc STCERR 'Alara. ..\n'; 

I SSIGI* ALRK- s 'htaidiff. waiting'; 

t| 

sub byusercount {

    $count{$a} <=> $count{$b};

}

sub Title {

    local($t) = @_;

    return "<HTML><TITLE>$t</TITLE>\n<BODY><H1>$t</H1>\n";

}

sub cleanup_tmp {

    if ($#tmpfiles == -1 && $debug) {

        print STDERR &printid, "No temporary files to clean up.\n";

    }

    for (@tmpfiles) {
        if (-e $_) {

            print STDERR &printid, "Unlinking $_: ", `ls -l $_` if $debug;
            unlink($_) || warn "Can't unlink $_: $!";
        } else {

            print STDERR &printid, "$_ doesn't exist\n" if $debug;

        }
    }
}


sub bydomain {

    $domainkey{$a} = &domainkey($a) unless defined($domainkey{$a});
    $domainkey{$b} = &domainkey($b) unless defined($domainkey{$b});
    return $domainkey{$a} cmp $domainkey{$b};

}

sub domainkey {

    local($url) = @_;

    local($scheme, $address, $port, $path, $query, $frag) = &wwwurl'parse($url);
    unless ($port) {
        $port = 80;
    }

    $address =~ s/(.)/\l$1/g;   # lower case - better way?  where's my perl book?
    local(@addr) = split(/\./, $address);

    return sprintf("%s://%s:%d%s", $scheme, join('.', reverse(@addr)),

                   $port, $path);

}

sub byusercountURL {

    # reverse order of count, then normal order of URL within count
    printf STDERR "\t%s(%d) v. %s(%d)\n", $a, $count{$a}, $b, $count{$b}

        if $debug > 3;
    local($res);

    if ($count{$a} != $count{$b}) {

        $res = $count{$b} <=> $count{$a};

        printf STDERR "\tCount: $res\n" if $debug > 3;
    } else {

        $res = &bydomain;

        printf STDERR "\tURL: $res\n" if $debug > 3;

    }

    return $res;
#    local($a,$b) = ($count{$a}, $count{$b});

}

sub url2fr. ( 

•ccalJSurll « 9.; 

local !$f cut •) • Surl; 

Sfr._uri t- sr'http:// 5 ! : • »trip 

Sfr. uri sti.'— )f*!g; • <la«en 

print* STDERR "Jsing file oasM \"%s\* for %s\n*. Sfn.uri. Surl if Sdebug > 
return Sfe.utl: 

! 

sub rcsfile {

    local($fn) = @_;

    return "pages/RCS/$fn,v";

}
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• Local Variablff: 

• nodi: ptrl * 

• End: * 
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i'.'t:r. ss ■ « cosBtsc co k««p ptri from bting confused 

eva*. fixec perl -S s: S{1«*$?M* 
.J • . 

push .fi:.':. \-hst/dcu9lis/Ub/pcrl*): 

Src:: - * ^csc.ccu^Ua/cz^/ccittsc/snipihoc* : 
SbACur.: s * -::££•; dsu;lis/bin;noaandsec* .- 



si.s:: £:l.d:r.i: 



:hd:r ;r::: . •cx::_;rtctfullyf*£r;ar m shdiriStooEi : Sf. 3J: 
^t«. : I stc all access bizs on 

9All:*uls.:pera:;:r.s : :*licne9Mr'. Oi!!'. 'Sucus*. 'Use All*. *Mitur" "Jsezf 



it aev*. 'All URLs'. 'CilL users'. Vitw :.rr«^ 



i?.*id?Brse.'* input; ; 

if .itfir.ed(Sinri:i!-:ypf))J I 

:t (Qi(;nedlS;nputi 'aMirn) ( 
Soiling ■ **: 

• tlsi 

iiRTJcCaMii'} s *$or/{'M)snz.uscR*iMscfvrRBmji05r'i* : 
StHilng 3 ■i«i>iAseri youz ustmiM betorc youx hoscoaM</i>»*: 



Scpcicn_5Crin9 s ••; 

:sr t9aUQw«Qie.operauoRS> { 

Scpcion_suin9 '<opuoii> ': 

print IPritttHMdir; 
?r»t qui 

<bc&:> 

<titif>JiC )tt»DS</title> 

<h2><D!: ALICHruddlt SKCsViogos/atlbUogo.gif * > 
Sotuazft tad Systacs at$«arcb</h2> <br> 

<ia>NC i»KDS</10> 

<iag aiignsTiQht- aU'*(lX> HAXDS I090I* siC3*/>douolis/ne.biiids/logo.glf*> 

<h2>Ui:r^ Ttii ror»</h2> 

This form is used to interact with the <a
href="/~douglis/no_hands/">NO HANDS</a> facility.  You can remember
what a page pointed to by a URL looks like, so you can return to it
later and see how it has changed.  (In Netscape, you can enter the URL
easily by holding down the right button over a link and selecting
<em>Copy this link location to clipboard</em>.  Note that in Netscape
1.1N, if you double-click on the URL in the <em>Location</em> line of
the page you want to track, when you come back to this page and try to
paste the URL nothing will happen.)  You may view the differences
between the most recent version you saw and the current version, or
see a history of past versions of the page (not only the ones you
saved away).  You may also see all URLs you have saved away, or see
information about other users of the facility.
<P>

Note that currently there is no protection: anyone can use any
``email address'' and can view each other's information.

<p> 

Some documentation about the <a href="/~douglis/no_hands/">NO HANDS</a>
facility and this <a href="/~douglis/no_hands/help.html">form</a> in
particular are available.  NO HANDS works best in conjunction with an
automatic <A HREF="/~douglis/track_urls/">notification system</A> that
tells you that a page has changed and
provides the URL to access NO HANDS for that page.

<?> 

<::r= r«tzrisr:r:> 
<?:«> 
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»?> iu;: ic=r»s: <ir;-ji nM»»*eaa;;' types 'text' vaiue s Sinputt'eaixl' ) size«49> SAuiiasg 
<7> :7«ri::cr.: <s«i«ct nt3tts*typc*> 

<;> 



<?><:> Jr^s r ::r4: r:st:: 3a< / 1 ><br> 

ir*fti?ars9 'input:. 

• zz-^.: ye: ;:^ieatnttc. . .<?>•. fcFrir.iVariifiieiJlinput} ; 
-.-.Iftsi *;tsp.«* 'w 'p*w* -d *pcopl«' M -d *9««ts*: < 

• spin:TEST. •>w»t2iie')» i 
■ ?r;r.:: tKt ditB , 

• ;r:r.:f rUT *7.^is itsi succMoid. .r.* : 

• Isci! !$!oci: 

• > Srccci s!;hoM/f/lioBtl. : : 

• pr;r.:* tQ;.£ucccss:.:::y wro» <A »£F:-!ilc:ls;ttst!ilf>tMc!iit</A>.'.r.!. SScoi: 

• *;tt • 

• ;;ir.:f **Jr^lt to ep«n ctstUlttr.*: 
• 

««xi:_;raci£-.llyi*c;;cist/peopic or c«i/p« ts not writable direetery*. 0); 

:! ::dii;ned &:sptiertmail*) || 

difi.'tcd $:rpucl*cype*| [! S input (*tMil*) cq **) { 
^ i«x;:_;raei£vl2yl'CCX script invoked vichwt sufSieient POST paraMcers, eaailsSicputl eBaii' I, type* Sinpucl 'type')*. 1): 

uitdtf S»r.i:e: 

fer StaUowabie.operacionsi ( 

Sw-oert s S_. lut it Sallovablt.operacionslS.l eq $inpuc(*cype*l: 

fccx^:.;r&:tfullvt*:ilfr9al operet;on $ input I 'type) spccilied*. l) 

'jsless defined S«borc: 
'Aiess lirpuc type'} *Vit« Current *) I 

?r:r.-- iPrintifeader: 

;£ :5ir^-::*:ype'l eq -ReoeacerV • 

^Z'.zzt *<7:tie>ReMSDer • 0P1</Ticle>\n*.- 

pr:r.:i *<Hl>aaeooer OXL: li</Hl>\o*, Smpucfurl*}; 

) 

unlesi »$;npgt{»ea*il'l =- /*\w( I |i S input ( 'type* ) eq •OiersM i 
kexit.;racefunyt 'Script tecainated due to ebnonal iivut*. 

•eMii addresa nut contain only elphaniMrici, <ltbd>.</ktad>. «ktd>9</ltbd>. or <kM>*</)ifad> and my not star: v::s a <i(fac^ 

} 

if :S;r;ju:rw) eq •Renmber' }| S input ( -type' » eq "Diff || 
S;r.r-:l '07e ) eq 'StAtuS') { 

♦^U$£ :Ur^t(*t:rl*} »?''littp://(\¥\.\l«-/V-:I»$?) I 

iexic^r«ccfully(*Script terwaaced due to abnenal input*. 

spr;5t: ris toCfending UXL: \*ls\-r. 'm. u restricted to <lcbd>bttp://</kbd» {oUowed by elpenuaberics or <kbd>/ K-b; :</i(bd>' 
Sinpat(*urlMn: 

t 

:: '.l:r.;u;! type- j eq 'Jifi' ii del inedlSinpuct version'))) I 
Itxzzt s Sinput? 'version' I; 

:tx-,ra r.c *pecuU;aace* Seztra "'.d^x.Nd'S/l ( 
;«x::.;racefsl:7f*Vers;cn tor diff specxSicd as Sextra not aeceptsble*. 



i^-f : $/ 

«xe: irtr.'.fr^, Sinputr type* • . Sinputl •esail' ; . Sinputrwl i. Sextra; 
;r:-: *'".r:rx>£rr3r prscessir^ C9iinAnd.</strsr.9»«n* 
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• ■/bin/sh • a cosnent to keep perl froa being conhistd 

# This script invokes htmldiff or rcsdiff on multiple versions of a
# file.  It can either run on a local file (file=~user/file) or a
# snapshot of a file (file=http://...).  These are treated somewhat
# differently.

# Actually, use diff instead of rcsdiff, and check out by hand, to
# keep the diff code itself the same regardless.

eval 'exec perl -S $0 ${1+"$@"}'

    if 0;

Sdftbu? s 1: 
Snccd. Locks * 1.- 

Ssoapscct » *h:tp:. ;wwv::26.re9earcn.aci.csa/cgi'bir.;r.s_hanas* : 

imshift f9Z:K. ♦/boffle/coi;giij. :lt^perl•^; 
• arcaiiecfjte-specii;: iibrar/ tfor sys /socket. pr.- 
require 'arcn.pl-.- 
Sarc.*: « kazct: 

iJ i-i '.Sliblcc = "r.==c/doasiis.'arch/Sarcr..'lib/per:*' . 
tstshifttllNC. Siiciici: 

) 

t What vtrsion arc ycu using? ise: this if using Swjrewer_dxr» 

Slit.vwv.ver = •0.40*: 

unshifc(8INC. 

ScNVi-uevAM.tm'i J * iQM/dougUs/Ub/pezl/lirrfM-perl-Slib.wv.ver*. 

I: 

if (defined SDWCPAW) U SW.M'PAW)) ( 

• print snm 'f»zz: $OIV( 'PATH' )<te>* if Sdibug: 
) else i 

SENVf 'PATH'} « '/haMrdou9lis/ireb/Sarcb/bin:/bOM/douglis/biA:/bia/ :/ttsrysbin:/usr/bin:/usr/ueb*; 
) 

unitssiSDwrhttpjroxy'J) t 
I $lHVrr^jro«y')«*att-coa*; 

• want researcn.att.coa to go outside 

Slwrno^rojcyM«*tbu. act. coa.ncr.com. radish, research, act. cott.w»»l 12 6. research, act. coa. ih.att.com.ho.att.co«.cb.att,csa.r.t.att.C2 

SENVt ' http_ptoxy * ! 3 ' hccp : / / radish . rtaearcn . att . cca : 8000/ * : 

I 



reqoxre 'cgi-lib.pl*. 
SBUT.nt * *cleaflup.t8p*; 
require 'cgi-exit.pl';
require 'cgi-alarm.pl';
require 'cgi-canon.pl';
require 'normalize_html.pl';
require 'hostname.pl';
require 'www.pl';
require 'lock.pl' if $need_locks;
require 'printid.pl';
require 'ctime.pl';

select(STDOUT); $| = 1;

print &PrintHeader;

• arch: tec ture-specifi: library ;for sys/ socket. phi 
require 'arch. pi': 
Sarc.-. ■ larch; 

if t'i ^Slibioc 7 * hsm/dougiis/arefa/Sarch/lib/pcrl')] i 
unsaiftl9ZKC. Slibloci: 



reqtiirc -findprcg.?:* 

:Sr:siiff.Shtaidi!i.i;i;fss4f;r.i;rog«Tcsd;f;'. •r.:s;iii;'. 'dilfJ: 

&exi:.;ra:e£ui:y:*UhB£;c ic:a:» resdi2*.vr.\tPA?»SaiV{'PA'n!'!'. Of if sdefined Srcsdiff; 
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king — 16:17 Oct23 

4ex:t gracefully t",Tjflle w ioctw hiaaeiM.'.ftNtPAW^iSDr.'rPMW 0> if !dtfincd Sbwldiff; 
Kxiclgr«ce£u;lv{--'r^l* to locate d:fj. '.n\tPAlH=SEr;i FA:X'r. 0) if idtfined SdilJ; 
Srcjdir = Srcsdiff: 
Srcsdir »- sj/rcsdilfS'!; 

SStUPHOMI : $EMV{'9uv?-} !| -.'hoM/dou9i:s/up/cgiust/siupsboc*: 

SSFOP = •SSttPHCKE.ptgtS*. 
SLOCX s *S9ttPH0K£i locks*: 

•print STOm •rciiiif.cg; enttwd^n* : 

&snapshoc.aucr.or • «ECf : 

*A aRIF5*hti?:;/w.-.rtstirca.ati.rca/or5$/5$r/peopie/d5U5lts/**Prcd Dou9lxs<'A> 
Ecr 

cicplJsnaparxt.authcr; : 
Shtaddiff.auwcr - <tECF; 

«« HREf«'n:;;://www-£;r.ih.att.::a.-tsa»:.">ToB Ea:i<.'^ 
EOF 

c.iop I Shtaici f !_aazr.r r • ' 

Strailer = *<HR><f::.T SIZE-3>T.''.i5 page 9eJi"8ted ty a ZZl Jcript vritten by Ssnapir-ci.autftcr . * . 

Satmidiff_:rail*r ^ using <i>*a hret«'hitp:/'w--s?r.;n.att.coB/-ctoil/htaXdiil.r.csr>ha3ldif:»/a>«. i>. wruiw: sy «.vji.s;..,ii-j::: 
Sb»J s 0; 

::uiechCeci ! * this one's done vizh KSt 

UlaadPar*-*!*:.-^^::. 

nit 1 if -.iftpu-.t'diff) ne iiff; 
dtlcte $in;«ut( ditf): 

exit 1 if •def«r.e3lSinput('fiic')l; 
iliU > Sinpui! filt ): 
delete Sinput{*f;ie'): 
if (defiDcd SicpuiircalCilc'lt 1 
Sretlfile s S;nputCrealfUa'>; 
if iSrtaUile b!^$S1IW/U\w\.\i\*\-:|*)S!1 i 
print; ♦Invalid !ilt specified: U\ $realfiie: 
exit I: 

) 

Sbase.fn s SI: 
LOCK: { 

if {Sneed.locksl { 

pri-nt STDERU iprmtil. •Trying to icck Sbasc.favn* if Sdekug: 
opea(WCit. '*<$LOCIt/$bMe.fnM |] 

princ StBSWt fcprintid. *ltanuiig: unable to open $U)CK/hase.fn.\r.*. last UJCX: 
UlccklLOCK. II II print STOOM kptihtii. unable to acquire lock on $wa/base.fn.\n*. last LOCK: 

) 

) 

delete Sir4)uci'realfila*K 
Sbtal s 1. 
StMM 1 1: 

if (defined SinputCuser')) I 

Suser s Sinputt'uscr*): 

delete Sinpatl'user' ) : 
) else 

Suser > QNRMOMN; 

r 

) else [ 

$WMf = C: 

Srealfilf > &c;i.cancnlSliiel: 
if :$fi<e \.fctaisn { 
ShtKl > :: 

) 

! 

»tap s j'/trp/rcsdaff.cgi.^i.SS'. vtap/rcsdifl.cgi.tS.SS'i; 
Soutput = "trp.rcsififf.cgi.eiff.SS-: 
#tapfues Soutputi: 
bcleanup.ts^: 

Sversions s ?: 

Sverstr « 

local iWersxcr.s . 

!cr Sv !ss:: r;V*rsicn neys tinputt ( 
pushiffver:::-!. Sv»: 
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if (Sivtrsicns It I 

pr:r.ci !*!iusc spteify tuctly cvo versions tc Sot ls\r.*. 

)Oir.(' tvtrsionsi); 

exit; 

t 

Hilti ' t): 
fcr Sv i^vffsionsl I 
li iihimi !| Smm) ( 

I ch*cx ouc ccspomily 

;f <$v e<; 'currtnt*) ( 

if iSMM) ( 

Salivtpid = icgi.kttpalivt: 

print STDZRX &pnntid. *KecpiUve $a!;v«p;s ^1 SfUexn* i£ Sdttaug; 
Sorigurl « Sfilt: 

Sctsponst s (MM-lrtqu*strG£7'. *filt. 'scaecrs. *::iiient:: •* 



kill SAiivtpid; f sxcar 

;! I !Viitpid(S«liv«pid. 0>i { 

printf snCRR *ls vaminQ: '^"vasU to v«;t f:r »iid kctpairvi process Sslivcpid\R*. &prin:;d: 

) 



it (SrispoRse /*2/J ( 

local(Srest) > SwMferr:r'Res7!'.essa?e!Sres;:r.sei: f: 
IdtfintdlSrestn ( 
Srest s •: $rut*: 

I 

Uiitjgracefttlly I 'Response Sresponse curir.; Ctt requests M.'. U : 

) 

print STDOUl ipxintid. *ilomliz:n9 URL SCile — IS Sdefau« 
Scontcnt = 4nozBMiUselitmll$contenc. Sfilti: 
ie IScootent «• ftmt filurxr^ Kno.:/) { 
MxicgracefulIytScoacenc. 0): 

I 

Scsp * popdtsp); 
opcntTHT. '>$t^>-) il 

4iexitj8rKelully( 'Can't open Stop** 01; 
print VSt Scoattnt: 
c lose (Dtp) : 
push < •files. Stqpl: 
) else ( 

push (tales. Srealfilel: 

) 

> elst ( 

* poplltsv); 
systcnrSco -pSv Srealfile > Sts^M kk 

fcexit_9rieefully(*Bmr nuuuag Seo -pSv Szeallile > Stip: $!*. 01; 
push llCiles. %W; 

) 

) else { 

Sverstr s ••rSv • if $v ne •current*; 

} 

} 



(-s $eiUs(01 » -t Sfiltsd) » 
:s/sttat*;bin/ca(3', $fiLes|0|, SfilesUin ( 
print *<h2>No differences encettatered<;h2><br>\n* : 
) cisif (Shti&l II Swiwl ( 

SfileUs: - joinr Ifilcsl; 
:! iSbtBl} ( 

Straiier s/\.$/Shtalditf_crailer./ : 

openiDir?. *SbeBldi£f Sfilelist Soutput 2>kl 1*1 II 

iexit.9racefullyl'Cu't invoke Sht&ldif!: S!*. Ol: 
whilel<0irp» I 

printf snsut S. if Sdebuff: 

C;0Se(DIf7J; 

print! STOERR 'Ezrcr S? Cros closelDIfFJVn* il S? ii Sdebu?; 
openlDIFf. *<Soutput*) |t fcexit.grBceiulIyi'CaA't opin cesporary output*. 0): 
i else ( 

opentDIFF. 'Sdift Sfileliscl*) || 

fcextc^gracefullyr Can't invoke £di££: S". 01. 

A iShtal: 

Sconte::: » join I <OIff>>. '\r.' . 
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cioseiDirn: 

LaollShoiti = £-o»cn4me rQOM: 
Suri » •hi;c:/.'S.':c*t/S£ii**; 
prtnt inomaiiitr-ialiScoaieni. Surli: 
> eisc ; 

?r:r.; Sccntuic. 

prin: >tr«ii«r: 

^* l^itSreMBieer = ?q-*ii> HaEr*'l»M?sMt?*ffAiirS.sir4«ri«Scrigvsi4:ypf*P*Btii©er'>Sei^ current wsiJ»D«*X» <zz> 

:* :Suitr eq *:r:ecnanges*' f • sptcu; cist :cr sow 
SrcMBoer = 




prin: "OF: 



'ii> <k Hrl£fs'Sor:r^ri*»view r-rrerx vBr5i5n<;A> 



<.'ui> 
EOF 



prir.t •<K:ML><Trr^>diff -c $:iit<;'m£> o*K5y>-r.«?R2> 

op«nvF.CS;:rf . 'Sdii: Sveritr UuitiU.". i txii 1- 

while (<RCSDIFF>) {

    # de-htmlify -- what else is missing?

    print;
    $lines++;

}

print "<h2>No differences encountered</h2>\n" if (!$lines);
print "</PRE></BODY>$trailer</HTML>\n";

) 

) fist ( 

print •<irfSiL><Tir-t>rcJdiff -c Sverftr $(ile</TraE>^ii<BC3Y>'r.<?SE>\n" 

openJRCSrnr. 'Sdiff -c *'/erstr Srealfiltl': li exit 1. 

$lines = 0;

while (<RCSDIFF>) {

    # de-htmlify -- what else is missing?

    s/&/&amp;/g;

    s/</&lt;/g;

    s/>/&gt;/g;

    print;

    $lines++;
}

print "<h2>No differences encountered</h2>\n" if (!$lines);
print "</PRE></BODY>$trailer</HTML>\n";

1 

fcclcamip.tqp: 

) 

sub cleanup_tmp {

    if ($#tmpfiles == -1 && $debug) {

        print STDERR &printid, "No temporary files to clean up.\n";
    }

    for (@tmpfiles) {
        if (-e $_) {

            print STDERR &printid, "Unlinking $_: ", `ls -l $_` if $debug;
            unlink($_) || warn "Can't unlink $_: $!";
        } else {

            print STDERR &printid, "$_ doesn't exist...\n" if $debug;
        }
    }
}



sut syversion ( 

return -1 it it «; 'currer.:*. 
return : :i 5* «; -cvrrer.-.- 
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/homeAball/dougiis-tmp/cgi-bin/snapshot.cgi 

king — 16:17 Oet23 



• '.cin.sn » ■ 4 CflOBcnt to kMp perl ttom otmq conftistd 
ivil »tc per: -i $3 sn»*$i'|- 

jj»r. i *.::»e/'eei;ii«/lib/ptrl'J: 

require 'cgi-lib.pl';
require 'cgi-exit.pl';

i:::: « * :::at.dcu9iis;tav'cgittst;«upshoc*. 



.::AltS:il!:. t select tSTDCMt : 



«-e;t:r:rr-T:. si 



ZT.t:z Jr:;- ieaiz.jreceiully' 'S-ror in s.^iir '.Srcct: : S*" 
:=:jbi.< : . t tct ell ecceit ciis on 

»i:::-.ic:«_c?er»t:cns • "atBiexe:-. 'DiSf '. 'ifvi% . Uit , 



Mistsr/ users Reftsatoet it mw . -AIIUXU*. "JRluseri* Vir.; :.;;re^ 



:! iKeta&et! •: 

i?.e»dfars«'"u:putJ ; 

..ittineiiSicy-^^Ctype- jn I 
.(!etine(l(S;>spttt('eBBil'l>l ( 
ScsBilas) * **: 
; else { 

SuiputCaaii ^ - *SDlV|'llSm.USn')M$DIV<-UH0f7ZJ!0ST'r : 
Seaeilss; • M<i>insert yow usenuae before your haeuiaBe<f i>i*: 

1 

ScpticR.string » 
fox iVallmwle.operscionsl ( 
Soptxsn.struig > *<opcioii> $. 

print (PrintHeader: 

ptiai q;- 

<hta;> 

<'.i:l«>NO KAi(DS</title> 

€h2>*'jK ALIOi«flLi(adIe Sltc>vl09os/«cebilo«o.glf * > 

Sof :vArt end Syiceu ReseerctK/h3> <lir> 

<body> 

<H2»» HMIC5</K2> 

<iB9 eUgr.>*ri9ht- elc-MNO XKHDS I090)* tre-*/-4ettsUs/iio.hiads/lo90'9i2*> 

<h3>UsiS9 Tms toTmtib2> 

This form is used to interact with the <a
href="/~douglis/no_hands/">NO HANDS</a> facility.  You can remember
what a page pointed to by a URL looks like, so you can return to it
later and see how it has changed.  (In Netscape, you can enter the URL
easily by holding down the right button over a link and selecting
<em>Copy this link location to clipboard</em>.  Note that in Netscape
1.1N, if you double-click on the URL in the <em>Location</em> line of
the page you want to track, when you come back to this page and try to
paste the URL nothing will happen.)  You may view the differences
between the most recent version you saw and the current version, or
see a history of past versions of the page (not only the ones you
saved away).  You may also see all URLs you have saved away, or see
information about other users of the facility.
<P>

Note that currently there is no protection: anyone can use any
``email address'' and can view each other's information.

<p> 

Some documentation about the <a href="/~douglis/no_hands/">NO HANDS</a>
facility and this <a href="/~douglis/no_hands/help.html">form</a> in
particular are available.  NO HANDS works best in conjunction with an
automatic <A HREF="/~douglis/track_urls/">notification system</A> that
tells you that a page has changed and
provides the URL to access NO HANDS for that page.

«p> 

<!:— =ei.-.2d»P95T> 

<p» < input nasie»*vrl* type- 'text* sise«40» 

'p> -T-i.: i^tess: *ir.put neBe»*eaail* type«*text* value ■ Sir.r-ti *«=*-- • »i«e-40> Sesailasg 
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<.s«ltc:> 

•-•l3ra> 
*?* 

K3U ;tJt p«9i i» thi *b>d«v»lopMnt</b> virsion of ;!» e»cilicy. I! 
yzi avt --ryjEU witb it you ny Jsnd « «ore <• >irt(**/cgi-bui/no.hiadi"> 

« --<p><:>LtscT ;snstruction«/i><6r>--> 
• , 

print iPr:::tTt»ii«r: 

tlse 

;:r.l«s$ -v -pages' «• -d 'peoplf U -6 •p»9tsM t 

• ,c?tr,(Tirr. •>testliifH ! 

• print! TSr: datf . 

» print! T55? 'Tfets ttst succ«t<Jtd. \n* : 

■ :£r«l iSfooi: 

» .$!3o « srooti tittiemt'.fhmmU'.: 

• prin:l r—rSuccMiiaiiy «rott <X iaEF»*:ilt:»«/tMtfilt*>MStfiU«'A*. .1: . Si:::: 

• I tis» ( 

t prin:I "JnabU to optn ttstCiltvn*: 

• } 

Mx;tjr*ct{u:i/''C9ittft/p«apit or C9i/pa9cs not vritablc dirtctory*. 0); 

t 

i: .:£tCince Sispsci-tMil*] M 

*«it.;r»MfullyrCCl script uwoiMd ifitkout «uffici«nt fOST paruttirs . tMil'Siap--:: wil'*. typfSiflpucl type ). ... 
unci! Swhert: 

fcr :s; . sMllcwAtlt.opftZstiensi ( 

Swhcrt < S.. lut :( S*llowaJQlt.opiratioas(SJ tq Suiput{*type*i : 

itxit.gt«ctfull7frnt9»l optrtcion $iiip«tl*typ«'» spteifitd*. ll 

uoltsi dttiatd Swiwn: 
ucitis iSinput'. typf l eq 'Vitw CurwafI ( 

priac 4?rintHt«dcr: 

il jJuipurl'v^-pt'j *»«w«b«rv ; 

pTiOt! •f?i;U>R«t»o«r • WU.</TltU>Vo*; 

print! •«Hl>Ii«M»tr OTL: U</Kl>\n'. Siaput I •«!•): 

wlcss iSisputi'Mai:*) — /-w|\¥«!,rsy II SiapuM'typtM "UMrfM I 

.^t^-r«.«:a;yt;^.pc m»n.uj^4»^»jj^ <»M>..,«td.. or <»M>.</ttd. «d «y «c . 

•! «$a!put:'typt» 'KnaabtT' !| $ input < 'CyP*' > «<l "W"* II 
Sxnputl 'type ) eq 'StatuiN I 

tinius iSinputrarlM »?''bttp;//|\¥\.\t!-/N-:)*$?) I 

••xxt oracatuUyt'Scrxpt t«raiMC«d dut to abnoiMX inpuf, ^.w%^*. 
iprxntf :-%s <offtadin5 URL: \'%fV)V '010. x« witricted to <kM>http;/M/lsbd> £filioi«l toy tlpwiabwriw or <kbd>/ .♦-b! :*/tte>'. 
S&nputrutlM)>; 

:$u:put( :^-p« «? "Ziif « fi«fintd{S«:puiCvtr>ion'))» t 
S«tr* * Sinput: vefiien* 

:! -Stxtr* r.» "ptr-uitxiMtf kfc Sutra ;-\d*\ .\d*$/» « 

fc«xit.;r»ct!ui:yrvtr$icn tor dift $p«ci.fitd as Stxtra net acctptsblf. 

i! Sdtbua- • 

print!r<PRI>^n*l: 

Ji=^-:! -.Ype i - »' 

ex«: s&acuni. Sinr—f'VP*' - Sinputfraaii' j . Sinputl ftrl*;. s«tra; 
prin: •«jtr:n?>£rr:r prseessm? ccaBand.</scren9>\r.-: 
;r:r.; »?::-trrail«r 
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• Local variablts: 
t flodt: p«rl * 
I I9d: * 



WO 97/15890                                                  PCT/US96/17142

[Drawing sheets 1/16 through 16/16. The figures themselves did not survive
text extraction. Labels that remain legible include boxes marked "PAGE" and
"DATE", a "Version history" screen, a "Projects/Labs/Groups" page, a screen
headed "http://snapple.cs.washington.edu:600/mobile/" with "File Description"
and "Version history" entries, and a report headed "The following URLs have
changed:".]
INTERNATIONAL SEARCH REPORT

International Application No
PCT/US 96/17142

A. CLASSIFICATION OF SUBJECT MATTER

IPC 6 G06F17/30

According to International Patent Classification (IPC) or to both national classification and IPC

B. FIELDS SEARCHED

Minimum documentation searched (classification system followed by classification symbols)

IPC 6 G06F

Documentation searched other than minimum documentation to the extent that such documents are included in the fields searched

Electronic data base consulted during the international search (name of data base and, where practical, search terms used)

C. DOCUMENTS CONSIDERED TO BE RELEVANT

Category *   Citation of document, with indication, where appropriate, of the relevant passages   Relevant to claim No.

Category: P,A
Citation: PROCEEDINGS OF THE USENIX 1996 ANNUAL
TECHNICAL CONFERENCE, PROCEEDINGS OF
USENIX, SAN DIEGO, CA, USA, 22-26 JAN.
1996, 1996, BERKELEY, CA, USA, USENIX
ASSOC, USA,
pages 165-176, XP000616939
DOUGLIS F ET AL: "Tracking and viewing
changes on the Web"
cited in the application
see the whole document
Relevant to claim No.: 1-11



☒ Further documents are listed in the continuation of box C.

□ Patent family members are listed in annex.

* Special categories of cited documents:

'A' document defining the general state of the art which is not
considered to be of particular relevance

'E' earlier document but published on or after the international
filing date

'L' document which may throw doubts on priority claim(s) or
which is cited to establish the publication date of another
citation or other special reason (as specified)

'O' document referring to an oral disclosure, use, exhibition or
other means

'P' document published prior to the international filing date but
later than the priority date claimed

'T' later document published after the international filing date
or priority date and not in conflict with the application but
cited to understand the principle or theory underlying the
invention

'X' document of particular relevance; the claimed invention
cannot be considered novel or cannot be considered to
involve an inventive step when the document is taken alone

'Y' document of particular relevance; the claimed invention
cannot be considered to involve an inventive step when the
document is combined with one or more other such documents,
such combination being obvious to a person skilled
in the art.

'&' document member of the same patent family



Date of the actual completion of the international search

3 February 1997

Date of mailing of the international search report

Name and mailing address of the ISA

European Patent Office, P.B. 5818 Patentlaan 2
NL - 2280 HV Rijswijk
Tel. (+31-70) 340-2040, Tx. 31 651 epo nl,
Fax: (+31-70) 340-3016

Authorized officer

Katerbau, R

Form PCT/ISA/210 (second sheet) (July 1992)

page 1 of 2



INTERNATIONAL SEARCH REPORT

International Application No
PCT/US 96/17142

C.(Continuation) DOCUMENTS CONSIDERED TO BE RELEVANT

Category *   Citation of document, with indication, where appropriate, of the relevant passages   Relevant to claim No.

Citation: SECOND INTERNATIONAL WORLD-WIDE WEB
CONFERENCE: MOSAIC AND THE WEB, CHICAGO,
IL, USA, 17-20 OCT. 1994,
vol. 28, no. 1-2, ISSN 0169-7552, COMPUTER
NETWORKS AND ISDN SYSTEMS, DEC. 1995,
ELSEVIER, NETHERLANDS,
pages 147-154, XP000616707
JONG-GYUN LIM: "Using Coollists to index
HTML documents in the Web"
see the whole document
Relevant to claim No.: 1-11

Citation: PROCEEDINGS. INTERNATIONAL CONFERENCE ON
TOOLS WITH ARTIFICIAL INTELLIGENCE,
1 January 1995,
pages 492-495, XP000567438
PAZZANI M ET AL: "LEARNING FROM HOTLISTS
AND COLDLISTS: TOWARDS A WWW INFORMATION
FILTERING AND SEEKING AGENT"
see the whole document
Relevant to claim No.: 1-11

Citation: PROCEEDINGS OF THE CONFERENCE ON
ARTIFICIAL INTELLIGENCE FOR APPLICATIONS,
ORLANDO, MAR. 1 - 5, 1993,
no. CONF. 9, 1 March 1993, INSTITUTE OF
ELECTRICAL AND ELECTRONICS ENGINEERS,
pages 345-352, XP000379626
BEERUD SHETH ET AL: "EVOLVING AGENTS FOR
PERSONALIZED INFORMATION FILTERING"
see the whole document
Relevant to claim No.: 1-11

Form PCT/ISA/210 (continuation of second sheet) (July 1992)

page 2 of 2







